Giter Site home page Giter Site logo

sql-scratch-capstone-turn-in's People

Contributors

vibiii avatar

sql-scratch-capstone-turn-in's Issues

Put each column being SELECTed on its own line

SELECT product_id, style, model_name, color, count(*) as total_purchases

You split up the columns being SELECTed onto their own lines in some places, but not in others, and it would help improve readability if you did so consistently, as this would help prevent overly long lines. For example, the above SELECT statement could be broken up like so:

SELECT product_id, 
  style, 
  model_name, 
  color, 
  COUNT(*) AS total_purchases

Changed column alias results in error

--Question 6 - Query 3 : most common results of the style quiz
SELECT style, count(*) AS total_answers
FROM quiz
GROUP BY style
ORDER BY number_answers DESC;

Note that in the SELECT statement, the count is aliased as total_answers, but in the ORDER BY statement, it is referred to as number_answers instead, resulting in an error. Be careful about using the proper names and avoiding typos like this.

Use indentation for differentiation between different parts of queries

-- Question 6 - Query 1 : overall conversion rates
WITH funnel AS (SELECT q.user_id,
CASE WHEN h.user_id IS NOT NULL THEN 'True' ELSE 'False' END AS 'is_home_try',
h.number_of_pairs,
CASE WHEN p.user_id IS NOT NULL THEN 'True' ELSE 'False' END AS 'is_purchase'
FROM
quiz AS q
LEFT JOIN home_try_on AS h
ON q.user_id = h.user_id
LEFT JOIN purchase AS p
ON h.user_id = p.user_id)
SELECT COUNT (distinct user_id) AS 'quiz', SUM (CASE WHEN is_home_try = 'True' then 1 ELSE 0 END) AS 'home_try', SUM (CASE WHEN is_purchase = 'True' then 1 ELSE 0 END) AS 'purchase'
FROM funnel;

Indentation can provide useful visual cues for when a WITH clause ends and which lines are just continuations of the same statement (for example, the different columns in a single SELECT statement or the ON lines in a JOIN). Below is a reformatted version of your query above that uses indentation to provide these visual cues (I also made some changes similar to in issue #3 to have the is_purchase and is_home_try columns be 1s and 0s initially):

 -- Question 6 - Query 1 : overall conversion rates 
WITH funnel AS (
  SELECT q.user_id, 
    h.user_id IS NOT NULL AS 'is_home_try', 
    h.number_of_pairs, 
    p.user_id IS NOT NULL AS 'is_purchase' 
  FROM quiz AS q 
  LEFT JOIN home_try_on AS h
    ON q.user_id = h.user_id 
  LEFT JOIN purchase AS p 
    ON h.user_id = p.user_id) 
SELECT COUNT (DISTINCT user_id) AS 'quiz', 
  SUM (is_home_try) AS 'home_try', 
  SUM (is_purchase ) AS 'purchase' 
FROM funnel; 

Avoid unnecessary tables and columns to increase query efficiency

-- Question 6 - Query 2 : AB Test result
WITH funnel AS (SELECT q.user_id,
CASE WHEN h.user_id IS NOT NULL THEN 'True' ELSE 'False' END AS 'is_home_try',
h.number_of_pairs,
CASE WHEN p.user_id IS NOT NULL THEN 'True' ELSE 'False' END AS 'is_purchase'
FROM
quiz AS q
LEFT JOIN home_try_on AS h
ON q.user_id = h.user_id
LEFT JOIN purchase AS p
ON h.user_id = p.user_id)
SELECT number_of_pairs, count (*) AS total_users, 1.0*sum (CASE WHEN is_purchase= 'True' THEN 1 ELSE 0 END) / count(*) AS conversion_rate
FROM funnel
WHERE number_of_pairs IS NOT NULL
GROUP BY number_of_pairs;

Since we are working with relatively small tables, efficiency isn't as much of an issue in this project, but if we were working with larger tables, we would want to make sure to avoid unnecessary JOINs and SELECTing unnecessary columns. For example, in the above query for finding the results of the AB test, we don't need the quiz table at all, as your query screens out all of the users who appear in the quiz table but not the home_try_on table. Also, instead of making the is_purchase column contain 'True' and 'False' strings in the funnel clause and then converting this into 1s and 0s when computing the conversion_rate in the main query, we can just make them 1s and 0s initially by assessing the "truthiness" of the statement p.user_id IS NOT NULL (as this will return 1 for every row where this is true and 0 for every row where it is not). Below is a revised version of your query which should run faster and return the same results:

WITH funnel AS (
  SELECT h.user_id,
    h.number_of_pairs,
    p.user_id IS NOT NULL AS 'is_purchase'
  FROM  home_try_on AS h
  LEFT JOIN purchase AS p
    ON h.user_id = p.user_id)
SELECT number_of_pairs, 
  COUNT (*) AS total_users, 
  1.0*SUM (is_purchase) / COUNT(*) AS conversion_rate
FROM funnel
GROUP BY number_of_pairs;

Warby Parker Project Summary & Rubric Score

Rubric Score

Criteria 1: Report: Are conclusions clear and supported by evidence in all answers?

  • Score Level: 4/4 (Exceeds Expectations)
  • Comment(s): You included appropriate queries and the results of those queries for each of the questions. What's more, you provided reasonable analysis of those results and used this to formulate clear, actionable recommendations. I particularly liked your analysis of the purchasing habits of people who selected black as their preferred color in the quiz, as it brings up some interesting questions. Good work!

Criteria 2: Query Accuracy: Do queries do what they were supposed to do?

  • Score Level: 3/4 (Meets Expectations)
  • Comment(s): Most of your queries retrieve the rows they are supposed to. Well done! However, their were two instances of columns being referred to by the wrong alias and one instance of a missing semi-colon, resulting in errors. These are described in more detail in issues #1 and #2. It looks like you have the data that these queries were supposed to retrieve in your report, though, so I'm not sure if these errors were introduced into your code when you were reformatting it for submission? Also, as mentioned in issue #3, you could improve the efficiency of some of your queries by not including unnecessary tables and by converting columns to the format you are looking for more directly.

Criteria 3: Query Formatting

  • Score Level: 3/4 (Meets Expectations)
  • Comment(s): Your code adheres to many of the formatting guidelines, but there were some reserved keywords that were not capitalized (see issue #4) and some SELECT clauses that should be split up into multiple lines (#5). Also, be sure to use indentation to improve readability (#6).

Criteria 4: Understanding underlying concepts and terminology

  • Score Level: 3/4 (Meets Expectations)
  • Comment(s): You demonstrated a clear understanding of the purchase funnel, but your analysis in slides 1.2 and 1.3 indicates some misconceptions about the quiz funnel. Since users have to answer each question to be able to answer the next (even if they answer it with 'Not sure. Let's skip it.'), the response rate for question 4 is actually pretty high, as while only 72% of users who started the quiz answered it, 95% (=361/380) of people who had answered question 3 (and thus were eligible to answer question 4) did in fact answer question 4. In general, since this is a funnel where people can drop out altogether after each step, the degree to which a question affects the drop out rate is better measured by the percentage of people who answered the preceding question who also answered this one, rather than the percent of people who started the quiz who also answered this question.

Overall Score: 13/16 (Meets Expectations)

Well done on this overall! Most of your queries do what they should, and the additional analysis you chose to perform posed the right sort of questions to gain insight from the available data. Do watch out for typos that can causes errors in your queries, though, and keep in mind ways to make your queries more efficient (as when working with larger tables, this can make a significant difference in runtime). Good luck as you move forward with your database querying!

Missing semi colon and changed alias result in error

SELECT count (distinct user_id), quiz_color, CASE WHEN purchase_color LIKE '%Black' THEN 'Black' ELSE 'Other' END AS purchase_black
FROM color
WHERE quiz_color = 'Black'
GROUP BY quiz_color,purchase_color

We need a semi colon at the end of this query, or else we get an error. Also, note that in the SELECT statement the alias purchase_black is used, but in the GROUP BY statement the alias purchase_color is used instead, which will cause an error (similar to issue #1).

Capitalize COUNT, SUM, ON, and AS

-- Question 6 - Query 8 : color bought by the people who answered "Black" to the color quiz and made a purchase
WITH color AS (SELECT q.user_id AS user_id, q.color AS quiz_color,p.color AS purchase_color
FROM quiz as q
LEFT JOIN purchase as p
on q.user_id = p.user_id
WHERE purchase_color IS NOT NULL)
SELECT count (distinct user_id), quiz_color, CASE WHEN purchase_color LIKE '%Black' THEN 'Black' ELSE 'Other' END AS purchase_black
FROM color
WHERE quiz_color = 'Black'
GROUP BY quiz_color,purchase_color

For the most part, you've done a good job of capitalizing the reserved keywords, but keep in mind that COUNT and SUM are reserved keywords, too. Also, make sure to consistently capitalize AS and ON as well (you do this in some places, but not, for instance, in some of the lines in the code sample above). This will make it easier to quickly scan through your queries.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.