What is Big-O?

Prelude

This will be the only theoretical (and fuzzy) chapter in your advanced Computer Science AB course this year. Once we are through this theoretical minefield, the rest of the course will consist mostly of the kind of interesting programming challenges you grew accustomed to in AP CSA. But we must clear this one hurdle first ...


Big-O (also frequently written "Big O" without a hyphen, or "Big-Oh") is an important theoretical and practical concept in Computer Science. Like many Computer Science principles, it is borrowed from theoretical mathematics.


We discussed Big-O only in passing in your AP Computer Science A class last year, when we talked about the efficiency of various sorting algorithms using this notation. Recall we said that a bubble sort is simple but requires O(n^2) (pronounced "O of n-squared") processing time, whereas a more efficient algorithm like QuickSort requires only O(n * log n). In this course, we will dramatically expand the applications of this useful tool.


The use of Big-O terminology differs slightly between the two disciplines of mathematics and Computer Science. So that you have a sound, fundamental understanding of where all of these principles come from, we will discuss Big-O as it relates to both Computer Science and mathematics.


It is crucial that you understand this concept because it will be used repeatedly throughout this course. For example, we will often choose one data structure over another because it is more efficient - one might be O(n^2) while a better choice might offer a faster algorithm that is O(n * log n).


This lesson is designed to help you better understand the Big O concept.

Big-O Notation Notes


n : This stands for some unknown number of data entries used in the program. If we wanted, for example, to sort 100 numbers, we would say that n equals 100 for that particular program run. For this lesson, n is always a non-negative integer.


c, k : These letters represent constants. For example, we may say that a sort algorithm we have implemented has 500 milliseconds (half a second) of overhead. That half second would be a constant regardless of the additional time required to sort n data entries. For Big-O analysis, constants are always positive numbers.


log n : This represents the logarithm of n. Students often ask what the base of the log is in Big-O analysis. Strangely, it does not matter which base is chosen, because the logarithms of n in two different bases differ only by a constant factor. As we shall show on a later slide, constants do not change Big-O analysis. Regardless, we typically assume base 10 or base 2 when doing Big-O analysis.
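The change-of-base idea can be verified with a few lines of Java. This is just an illustrative sketch (the class name and sample value of n are made up for the demo): no matter what n is, log base 2 of n divided by log base 10 of n is always the same constant, which is why the base is irrelevant to Big-O.

```java
public class LogBaseDemo {
    public static void main(String[] args) {
        // Change of base: log_2(n) = log_10(n) / log_10(2),
        // so log_2(n) / log_10(n) is a constant (~3.3219) for every n > 1.
        for (double n : new double[] {10, 1000, 1_000_000}) {
            double log2  = Math.log(n) / Math.log(2); // log base 2 of n
            double log10 = Math.log10(n);             // log base 10 of n
            System.out.println("n = " + n + ", ratio = " + (log2 / log10));
        }
        // The constant itself:
        System.out.println("1 / log10(2) = " + (1.0 / Math.log10(2)));
    }
}
```

Because the two logs differ only by that fixed multiplier, O(log_2 n) and O(log_10 n) describe exactly the same growth rate.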


IFF : This mathematical abbreviation stands for "if and only if."


What is Big-O?

Simply stated, Big-O is a mathematical way of describing the efficiency of an algorithm. For a given program, we want to know how its run-time grows as the size of the data gets larger. Thus, Big-O is about how fast a program scales - BIG-O DOES NOT DESCRIBE HOW FAST A PROGRAM RUNS.


Take, for example, a simple method that counts the number of times a value appears in an array.

Let us define each step as one comparison. Thus, each time we compare a value in the array to the value we are searching for, we would say that another step is required for this algorithm. We want to measure run-time by counting the number of steps required: the more steps we need to execute, the longer the program takes. This concept, along with more complex examples, is featured in the videos that follow in later sections of this unit.