Uncategorized

Improving code with RegEx

Background

Because of the way the prerequisites were entered into the course websites, I needed to write a little function parse strings such as “INFS1200 + GEOM1100 + 1200 + 1300″. In English, we know that 1200 and 1300 refer to GEOM1200 and GEOM1300 but that was assumed so left off. However, I needed to create a List object with the course codes themselves. So after I use list.split() on the string, I iterated over the list and the letters of the previous element to any current element that consists only of digits.

My little function was fine while I could list = string.split(” + “), but this breaks down when you discover that sometimes they are entered as “INFS1200 or GEOM1100 and 1200 & 1300” or something equally inconsistent.

Improvements

So it was time to learn RegEx! And therefore re-write the function in a more generic way. I needed to find any string that consisted of 4 uppercase letters followed by 4 digits or 2 uppercase letters followed by 4 digits, putting each result into the list.

After some searching and experimentation, I found that (simple) RegEx wasn’t as difficult as I thought and kind of fun. I used the following:

[A-Z]{4}[0-9]{4} | [A-Z]{2}[0-9]{4} | [0-9]{4}r

This works well and means I don’t have to worry how they separate the course codes. After adding the course codes to the list, I then iterate over it to fix the missing subject matter letters (INFS or GEOM etc).

Posted by Anthony

Introducing the projects!

I’ve set up this place to document my university projects and personal project histories, learnings and progress. I’m going to start with pages for the university projects I’ve done in 2016 and continue with ongoing technology projects I’m working on.

In my spare time, I am also working on a personal database and Python project to help students choose subjects at UQ.

Posted by Anthony, 1 comment