Quick Project: Movies Database/Google Form

For a long time, I’ve wanted to create a database of the movies I have. Because I have movies scattered around on various media, I’d like to be able to point friends to a site where they can see all the movies I have, and I can look up to see what media it is available on. In my case, I have a PVR, a PVR on the Mac, some DVDs, and movies I’ve bought off Google Play.

Eventually, I realised that Google Forms just saves to a Google Sheets spreadsheet and I could use this spreadsheet to find the location, or share a link to it so my friends can pick a movie. I also figured that I could somehow script the act of responding to the Google Form or otherwise populate the data, reading the list of recordings on my Mac PVR from the filesystem.

After a bit of searching around, I came across this Reddit thread, How can I use Python to submit a Google Form (or write to a response spreadsheet)? which suggested the easiest way to submit the Google Form. With a bit of Python magic, I created a simple script to read the files and folders in a directory, and submit them straight to the Google Form!

Read on to find out how!

Creating the Google Form

Step 1.

Go to Google Forms and create a new form. Call it My Movies or whatever you like. My use case is just me filling this in or other members of my household so it doesn’t need to be fancy.

    1. starting a new Google Form
    2. empty formA: Your Form title.
      B: a description.
      C: question text.
      D: Change question type.
      E: change question to be Required.
      F. Options toolbar – add a new question, add subtitle, add a background, insert video, add section.

 

Step 2:

Add questions so it looks something like this:

my example form

For the Location question, you could choose from Dropdown or Multiple Choice Grid if you had lots of locations. (Say, several bookshelves or whatever.)

Step 3:

This step is important. You need to select a destination for your responses to a new Google Sheet so that you can share that (or analyse it further). If not, the responses are only saved inside the Form itself and they are not as easy to access. Click the green button to create a spreadsheet.

responses tab

You get this screen:

select a destination

Creating a new one is easier for our purposes, but using an existing Google Sheet is possible too.

When you click Create, Google Forms creates a new Google Sheets spreadsheet with the appropriate columns and links the form to it. Note that the data only goes one way: from the form to the spreadsheet. You can’t “create” a response by adding data to the sheet, I tried that already!

new Google Sheet And see how they are linked…it's linked to the form

And now you’re done. Because it’s your form, you’ll have to get the public URL or preview it to fill it in. (Google Forms is really designed for creating polls of course.)

You’ll need the Pre-filled Link for the next part anyway so we may as well get that now:

getting the pre-filled link

That takes you to a new page:

making the prefilled link

Where you can answer any questions that you want to prefill the answers to. For this example, I’m pretending that you give yourself or family member the task of adding all your DVDs to the “database”, so that question being pre-filled would be useful. You could make another pre-filled link with Google Play already selected later. You can imagine how in a different form you could give out different pre-filled links to different people where you already know certain responses (like their school or membership).

After you click submit, this comes up:

The whole link for this example form is:

https://docs.google.com/forms/d/e/1FAIpQLSdUzb1NOgeqIigZbaNU5JiAvIoMJNBteDGI93znQafXC3heYw/viewform?usp=pp_url&entry.495577097&entry.1315588752=DVD

I’ll explain the structure of this in the next section.

Creating the Python script

My movies are just in the root directory of the Movies folder, so my script was simple. If yours are in subfolders by genre or something you might want to modify the script. You can just use my script. Remember to update the form ID and the dictionary’s keys to be the correct form question IDs for your form.

My script using the example form.

import requests
import sys
import os
import time


# This is the "Location" question whose answer is where this
# script is being run from. It should match one of the entries in your Google
# Form.
this_location = sys.argv[2]
directory_to_scan = sys.argv[1]

# copy the form ID to here
form_id = "1FAIpQLSdUzb1NOgeqIigZbaNU5JiAvIoMJNBteDGI93znQafXC3heYw"

# modify the submission url as you need
url = f"https://docs.google.com/forms/d/e/{form_id}/formResponse"

if form_id == "":
    print("You need to update the form ID (and questions) in the source code...")
    exit()

for entry in os.listdir(directory_to_scan):
    if entry != sys.argv[0]:
        if os.path.isfile(os.path.join(directory_to_scan, entry)):
            last_dot = entry.rfind(".")
            if last_dot == -1:
                last_dot = len(entry)
        else:
            last_dot = len(entry)
        # update the dictionary to reflect your own Form Question IDs
        # (Use the pre-filled link as a guide)
        data = {"entry.495577097": entry[:last_dot],
                "entry.1315588752": this_location}
        request = requests.post(url, data)
        print(entry[:last_dot], request.status_code)
        request.close()

        # let's pause for a couple of seconds to be a nice internet citizen
        # time.sleep(2)

I use the Requests library to POST the form response to the URL as described in the Reddit thread. I verified what they were describing by inspecting the HTML of the form.

Note well that the form submits to /formResponse rather than what the prefill link has as /viewForm.

Focusing on the for loop at line 23, I basically loop over each item in the list returned by os.listdir():

I used str.rindex("s") to find the position of the last dot. With this, I could get the substring of the filename without text extension. If str.rindex("s") returned -1, I assumed it didn’t have a dot in the name. I also wanted to include those as movies too.

I don’t want to clip of the last few characters of directories since they don’t have an extension, so I used os.path.isfile() and os.path.join() to only do the above checks if the current file is a not a directory. And completely skip over if the filename is the running script – this way you can just copy the script to the directory you want to parse and give it “.” as the first parameter. Note that I actually only added this functionality after I’d ran it on real data and discovered some bugs.

Then I create a dictionary of the data to submit, which consists of the answers to the form. I used the “&entry.495577097&entry.1315588752=DVD” part of the pre-fill link to help determine element IDs being submitted. Plus I inspected the HTML of the form to determine the IDs of the HTML elements to verify my assumptions and the Reddit thread. If you type something obvious into the pre-fill link form, you can clearly see which “entry.xxxxx” goes to what question. Just like entry.1315588752 goes to Location since the pre-filled answer is DVD.

If you are concerned about what you are submitting, you can use Request Bin as your requests.post() URL and verify what you are sending. It is a website that accepts HTTP submissions – GET and POST and shows them back to you, throwing them away the next day.

To be a nice internet citizen I put a delay in between each loop. (Well, really it’s Google so they have plenty of bandwidth, but it’s good practice I think.)

Because I’m publishing the code, I also added in a (very) basic sanity check in case the user hasn’t updated the source code with their own details.

For my script, I used sys.argv to get the folder and location values passed as parameters on the command line, but you could also use input() and just prompt if you prefer. Or modify the source code each time you run it on a new PC.

To Conclude

Now, I can run this script on each computer I have movies on and I can send a pre-filled link to myself so I can just use my phone or laptop and enter all the DVDs I have or recordings on the TV PVR! Google Forms is mobile responsive so it’s easy to just stand at the bookshelf and enter them. If you have family members who just bought a bunch of new DVDs, they too can enter them just as easily.

Of course, a real database might be better since you can update and delete entries easily and run your own queries; but there are a few issues:

  • Setting up a database takes technical knowledge – this solution is usable by anyone and you can skip the Python script if you want to hand enter all of it. Though it’s an easy script to modify, some Python experience is useful…
  • You need somewhere to host it
  • You probably want a web interface too. This is even more coding to write and can get quite involved.

By storing the results into a Google spreadsheet, I can still run spreadsheet-based calculations. If I need to add further options, I believe the spreadsheet will cope – the existing responses will just be empty for the new columns.

As you can see, you could use this same technique for lots of different media – books, CDs, video games. I’ll just populate the Google Play movies in my collection by hand, but if you had many movies there, you could even scrape your own Google Play Movies – My Movies screen.

Posted by Anthony