Review: CPSC 320, 404

Posted on June 23, 2015 in ubc • 8 min read

I took CPSC 320 and 404 this year (in 2014W2) along with a few electives, right after my second 8-month co-op stint. I actually took a lighter workload than I would usually take during regular winter sessions, in part because I was also doing a TA-ship for the first time (in fact, I've gone ahead and wrote a review about my TA experience). Keep on reading for my thoughts about these two upper-level CPSC courses.

CPSC 320:

In a nutshell:

Language: none in particular

Toolset: again, none in particular

Prereqs: CPSC 221, and either (a) 6 credits of 2nd year MATH or STAT or (b) 3 credits of 2nd year MATH or STAT with a grade of 72% or better. The MATH/STAT credits that you'll use to satisfy these prereqs are likely to come from a combination of MATH 200, MATH 221, and STAT 200+302 / STAT 241.

Website/resources: CPSC 320, Piazza

Textbook: Algorithm Design by Kleinberg and Tardos

Presenting CPSC 320, known as "Algorithms and Dating Advice"...according to our assignment hand-in box, at least:

img_cpsc320_assignment_box

CPSC 320, more formally known as "Intermediate Algorithm Design and Analysis", is a sequel/successor of sorts to CPSC 221; it covers a bunch of different algorithms and paradigms not covered in CPSC 221, with little overlap (you'll be given a review of asymptotic analysis and notation, e.g. big O, omega, theta, etc., and as you might expect, proofs and proof techniques are still very much relevant). In my opinion, CPSC 320 isn't nearly as broad as its predecessor, but it is still just as dense and material-heavy. The broad categories that you'll cover during the course include graphs, greedy algorithms, divide-and-conquer algorithms, recurrences, memoization, dynamic programming, and NP-completeness. I believe the course is also supposed to cover randomization and amortized analysis (judging by the syllabus provided for previous terms), but we didn't really have time to get to that this term.

I think it's helpful to point out the learning goals for the overall course (quoting from the syllabus):

  1. Recognize which algorithm design technique(s), such as divide and conquer, prune and search, greedy strategies, or dynamic programming was used in a given algorithm.
  2. Select and judge several promising paradigms and/or data structures (possibly slightly modified) for a given problem by analyzing the problem’s properties.
  3. Implement a solution to a problem using a specified algorithm design paradigm, given sufficient information about the form of that problem’s solution.
  4. Select, judge and apply promising mathematical techniques (such as asymptotic notations, recurrence relations, amortized analysis and decision trees) to establish reasonably tight upper and lower bounds on the running time of algorithms.
  5. Recognize similarities between a new problem and some of the problems they have encountered, and judge whether or not these similarities can be leveraged towards designing an algorithm for the new problem.

In essence, I feel like CPSC 221 was all about giving you the basic toolkit you needed to understand algorithms, learning about the core data structures that you'll see appear over and over again (e.g. lists, trees, graphs, etc.) and common algorithms that are applicable everywhere (e.g. sorting, hashing, etc.); CPSC 320 is about building on that knowledge and applying it to a wider variety of problems, as well as a more directed focus on finding more efficient ways of solving a given problem (or whether that's even possible to begin with). It's definitely a course worth taking if you found CPSC 221 even remotely interesting. Fair warning though, I consider this course to be the most difficult 3rd year CPSC course I've taken to date (compared to CPSC 304, 310, 313, and 317), both in terms of workload and also the difficulty of the material that's assigned and taught.

Dr. Steve Wolfman taught this course when I took it. For those who've taken previous courses with him, you should already know what to expect: highly interactive lectures, and a very hands-on approach to problem solving in-class. In fact, lectures involve very little lecturing; instead, the bulk of class time was spent working on problems with your immediate neighbours, with Steve periodically giving out hints and then solutions near the end of class. As a result, you're supposed to do all the readings outside of class and in advance to the lectures so you were able to actually work on the problems. I'm generally quite a fan of his interactive style of teaching, although I admit that I'm not very diligent when it comes to pre-reading.

Another peculiarity with courses taught by Steve is evening group midterm exams and group finals. How this works is that when you take an exam, you first take it individually, and then you take the same (or very similar) exam again as a group of 3-5 of your fellow classmates; your resulting exam mark is derived from 85% of your individual mark and 15% of the group mark (with the individual mark being worth 100% if higher than the group mark, so you'll never get penalized for doing the group exam). As a result, you get immediate feedback on how you did; for me, that often translates from a mental thought process of "I think I did well on the exam" after taking the individual portion, to "oh crap, I answered this and that and everything else wrong so I must have totally bombed that exam" after the group exam. :P

It's also worth noting that the exams were all open-book, i.e. you were allowed to bring the textbook and 3 binders worth of notes if you desired to do so. In reality, that provides little more than just a boost to your confidence; it's unlikely you'll have enough time during the exam to fully utilize your notes, and Steve's exams will test you on how capable you are at applying the theorems and algorithms you've learned, not on how well you can regurgitate material.

Assignments are quite heavily biased towards theory and applying the concepts you've learned from class and the textbook, with very little coding involved; I recall only 1 of the 7 assignments actually involved any coding. They're also considerably time-consuming, so I highly recommend finding a partner (you're allowed to work in groups of 2 max) and working on them in advance of the due date.

Grading consisted of 7 assignments worth 20%, 2 midterms worth 30% total, lecture pre-reading quizzes (of which there were about ~20) worth 1% total, a final exam worth 45%; the remaining 4% goes towards whichever component above you did best in. Unlike most other Science courses, you do not need to achieve >50% on the final to pass; instead, you had to achieve >50% on the weighted average of the midterms and the final exam (as well as the overall course itself, of course), which was a relief to me as I thought the final was considerably harder than the midterms and I was somewhat worried that I might not have passed it. It's worth noting that I'm unsure whether this applies to just Steve's CPSC 320 sections, or all sections of the course taught by other professors as well.

CPSC 404:

In a nutshell:

Language: relational algebra, SQL

Toolset: exposure to a number of RDBMSs, including IBM's DB2 and Microsoft SQL Server

Prereqs: CPSC 213 and CPSC 304

Website/resources: CPSC 404; most of the course material is on Connect, with Q&A on Piazza

Textbook: Database Management Systems, 3rd ed., by Ramakrishnan and Gehrke (same as CPSC 304!)

CPSC 404, a.k.a. "Advanced Relational Database Systems", is as you might expect the sequel of CPSC 304. 304 serves the purpose of introducing you to relational databases (which I'll abbreviate to RDBMS from now on), including some of the necessary theory you need to understand like E-R models and relational algebra, and then actually using a RDBMS (e.g. by learning SQL syntax and being able to write out SQL queries). CPSC 404, on the other hand, dives into the nitty gritty internals of a RDBMS, focusing on how RDBMSs are implemented as well as the underlying data structures used by many RDBMSs. Therefore, if you're thinking of taking CPSC 404 with the intent of becoming a better DBA or to brush up on SQL, you'll probably be disappointed. On the other hand, if you're interested in learning more about what goes on behind the scenes in a typical RDBMS, the course material should be relevant to your interests.

You'll start off by learning about learning about topics like storage/memory hierarchy, buffer pool management, and I/O costs; calculating and comparing page I/O costs will be a prevalent theme throughout the entire course, and most of the calculations you'll be doing involve calculating I/O cost one way or another. That's followed by more than a month of lectures on both tree-structured and hash-structured indexes, data structures used to represent and maintain indexes, time/space complexity of common operations (insert, update, delete) performed on these data structures, and hashing; this includes hashing methods that were previously covered in CPSC 221 like static hashing, and methods that weren't, like extendible hashing and linear hashing. You'll see B+ trees again during this unit, which is another source of overlap with CPSC 221. This is followed by external sorting, including external mergesort and 2PMMS (the motivation for this is that regular in-memory mergesort like the one you learned in CPSC 221 doesn't account for the I/O overhead in RDMBSs). Next up is coverage of query evaluation and optimization, followed by a few weeks of cramming in data warehousing. Overall, I found the material to be quite straight-forward but dry at times, with the exception of the last unit on data warehousing which I found to be very confusing (I don't think cramming in this bulky unit in the last few weeks of class helped much).

CPSC 404 has a fairly standard marking scheme, with a combination of 2 assignments (4% each), 4 in-class midterms (10% each), pre-class and in-class exercises (10% total), clickers (5%), a final exam worth 37%, and 1% bonus for filling out surveys. The two assignments involved using Microsoft SQL Server and SQL Server Management Studio and running through a very detailed set of instructions with a partner (the assignment PDFs were 40-50 pages long, most of which was along the lines of "click this button" or "open this dialog box" or "type in this query" etc.), with questions and a brief Q&A follow-up with a TA. To be honest, I didn't find the assignments all that helpful; they were quite tedious, and the data warehousing assignment in particular was just confusing. I think the pre-class and in-class exercises, which were brief and to the point, were much more enjoyable and instructional.

Dr. Ed Knorr was my instructor for this session of CPSC 404, and this was the first time I took a class taught by him. I found him to be an approachable and overall effective professor, and he's clearly very passionate in the subjects he teaches, although I feel like he has a tendency to go off on unrelated tangents during lectures sometimes. He also has a mild-mannered voice which was a contributing factor to me falling asleep in class more than once (although lack of sleep was the predominant factor, of course, and that's purely my own fault).