malyj_gorgan | Mar. 20th, 2021

Відповідь до вчорашньої задачки. Основні проблеми виловив sassa_nf:
SELECT some_date, COUNT(DISTINCT x) as num_x, COUNT(DISTINCT z) FROM t3 WHERE some_date > some_start_date AND some_date < some_end_date AND x IN ( SELECT x FROM t1 WHERE SUBSTRING(y, 1, 11) = 'some_string' AND LOWERCASE(x) LIKE '%some_other_string%' AND LOWERCASE(x) NOT LIKE '%yet_another_string%' ) GROUP BY some_date ORDER BY num_x LIMIT 10
UPDATE: Якщо кому не віриться, ось варіант з JOIN'ом, трохи симетричніший на вигляд, а виконується за той самий час або трохи довше, залежно від платформи:SELECT some_date, COUNT(DISTINCT x) as num_x, COUNT(DISTINCT z) FROM (SELECT x FROM t1 WHERE SUBSTRING(y, 1, 11) = 'some_string' AND LOWERCASE(x) LIKE '%some_other_string%' AND LOWERCASE(x) NOT LIKE '%yet_another_string%') INNER JOIN (SELECT x,z,some_date FROM t3 WHERE some_date > some_start_date AND some_date < some_end_date) USING(x) GROUP BY 1 ORDER BY 2 LIMIT 10 Але верхній варіант швидший... імхо

А ось ще задачка. Трохи спрощений варіант реальної задачі, яку мені треба було розв'язати для одного з наших клієнтів, коли я лише почав працювати на попередній роботі. Там рішення, ніби, існувало, мій попередник на тій позиції написав SELECT query на кілька екранів, з усякими зайвими UNION'ами і неоптимальними ходами, яке в якийсь момент почало ламатися. Я втикав у той код, втикав... плюнув і переписав начисто, коротко і читаблеьно.
А потім я давав таку спрощену задачку на співбесідах, на пізніших стадіях, де ти годину балакаєш наживо. Троє людей її провалили, а дві кандидатки відповіли, хоч і одна і не довела до кінця. Обидвох ми прийняли працювати, обидві стали прекрасними спеціалістами, ліпшими за мене.

Словом, ось задача, про відтік юзерів у мобільного оператора:

Plot the daily user attrition rate in a mobile network operator. fraction of currently active users leaving the company's active base. The attrition rate is independent of the new user engagement, those are two independent processes (which, when combined, define the growth or decline of the company's user base). All you have is a daily imported from the operator table TBL with the user body information for each day in the last month. The three fields in the table tat you need are like this:

uid: unique user identifier, long hexidecimal string

status: user status (a single character "A", "G", "P", "Z", "D")

dt: date of the recorded status

The user status can be:

- "A" for "acive user", paying, in good standing

- "G" for user in "grace period", not active anymore yet not disabled, can be re-activated if s/he pays back

- other characters are for pre-registered users or users at some stage of being deleted, but for various reason for most users we never see those other status values, whereas any user is guaranteed to spend at least seven days in "A" status before suspension and then some time in "G", too, unless reactivating.

In terms of the status, today's attrition value is the count of users who are in status "A" today and "G" tomorrow minus a small trickle of resurrected users, those who are "G" today and "A" tomorrow. Consequently, the attrition rate is the ratio of this number to the total number of users in status "A" as of today.

Write a SQL query that would return two columns: the date and the attrition rate on that day.
You may or may not need to use a function DATEADD(datepart, interval, DATE), which adds "interval" dateparts to a date object, for example DATEADD(DAY, -2, CURRENT_DATE) is the day before yesterday.

(Я спочатку спеціально уточнив, що я на той час лише почав був працювати: я тоді не дуже знав, як працюють WINDOW фунцкії, тому боявся їх і писав усе без них (з ними було би ще на порядок простіше). Це я до того, що можна писати з вікном, а можна без вікна, з JOIN'ом.)

Апдейт:
На всякий випадок, уточню, що можливі зміни у статусі юзера є не лише між "A" і "G" . Варіантів пар статусів сьогодні-завтра у одного юзера багато, просто не всі завжди видно. Ось повний перелік: NULL-P, NULL-A, P-P, P-A, A-A, A-G, G-A, G-G, G-Z, G-C, G-NULL, C-NULL, Z-NULL. (NULL означає, що нема рядка для даного юзера в даний день)

Update 2: варіанти розв'язку в коментах

S	M	T	W	T	F	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

Малий Ґорґан

Mar. 20th, 2021

Mar. 20th, 2021

Відповідь

знову SQL

Profile

January 2026

Most Popular Tags

Page Summary

Active Entries

Style Credit

Expand Cut Tags