Series Week 3 - SQL & Databases: What Does It Really Mean To Work With Data In Tech?
Hi everyone! Thanks for coming back for another week of the Talk Like A Developer series. This week I’ll introduce you to the most common tools used on the data analytics side of computer science: SQL and Databases.
About this series: This series isn’t going to give you everything you need to roll up your sleeves and start coding. What it will do is give everyone, even the people who have no idea what computer science is, the ability to talk and ask questions about these topics so that if you are interested in getting into the hands-on/technical side of these topics, you’ll know where to start, what to Google, or how to ask for help.
If you haven’t already, check out last week’s post: Week 2 - Languages and IDEs as well as Key Coding Terminology For Beginners to get up to speed on the common terminology used in this series!
What is a database:
Databases are basically HUGE electronic filing cabinets that are very well organized and can be searched through in seconds. Most of the time, if you were to look at a database online, you would see a table (or multiple tables) containing information with various rows, columns, and values.
For example, you may have heard of IMDb (Internet Movie Database) or IMDb ratings for movies. The tables in IMDb have columns for information such as movie title, year, actors, rating, genre, etc. Storing data (information) in a table like this allows a computer to quickly and easily sort and/or search for specific data based on criteria you may provide.
Let’s say I wanted to find a list of all Leonardo Dicaprio movies from the last 20 years. Instead of searching every movie to see if Leo was in it and if it was from the last 20 years, we can search through a table quickly by filtering on the actor column for Leo and filtering on the year column for any year in the last 20 years.
When a data analyst wants to do search and filter actions like this on their company’s databases, they will use a programming language called SQL (most programmers say “see-kwl”, not “S-Q-L”).
What is SQL:
Structured Query Language (SQL) is a querying language used to query data from a database. There are other querying languages such as XQuery, OQL, GraphQL, LINQ, etc, but SQL is the most common/basic.
SQL is pretty simple and easy to learn, but you can do some pretty fancy commands with it. A basic SQL query (command) is a line of code stating which table you want to look into, which columns from that table you want to look into, and any other specifications about the data you want to provide to narrow down your search. Once you execute the query (like hitting “search” on Google), the result you see is a consolidated table with only the information matching the criteria you provided in the query.
So far we have 2 pieces: the database of information (data) and the language used to look through the database. So how do we connect them?
Data analysts will use something called a Database Management System (or DMBS), which is basically just software that allows you to view a database in an organized way. Within a DMBS, there is functionality for you to create a SQL file, connect it to the database (which most DMBSs will do for you automatically), and then type and execute SQL queries on that database.
If you were using Microsoft’s DMBS (called SQL Server Management Studio or SSMS) you would see something like this:
On the left side of the screen you can see the folders containing databases, and whichever database you have selected (by clicking on it) is the one that your SQL query will look through. The window on the top middle of the screen is a SQL file where you would actually write query (search) information and criteria. In the example in this image, it looks like they are searching for the top 1000 results within a Resident database and they want the results to include the information in the columns with ID, name, address, etc.
The window on the bottom middle of the screen is the results window. Once you execute your query (if you were using SSMS, you would hit that “Execute” button at the top of the screen), a results window pops up at the bottom like you see above and, as I mentioned earlier, it contains a consolidated version of the whole database, only including the information matching the criteria provided.
While all Database Management Systems look a bit different, they all have the same basic functionality you saw above: a window for database organization, a window for querying (using one of the querying languages), and a results window for displaying the results of your query.
FINAL THOUGHTS:
Using query results, data analysts can do things like getting side-by-side comparisons of different data, find missing/invalid information that needs to be addressed, find patterns or irregularities in their data that might be hard to notice on the surface, and so much more.
Is data science an area of computer science you might be interested in? Let me know in the comments below!
Thank you so much for reading and come back next Monday for another Talk Like A Developer post! If you have any questions, feel free to comment them below or email our technology blog directly at techblog@apartfromblonde.com!
Check out this and more posts on my Medium account HERE!