# An Intuitive Introduction to Calculus

In this post, we dial back a bit on the mathematical sophistication and take on a subject that should be familiar to most people in STEM (Science, Technology, Engineering and Mathematics) fields, but that has a scary reputation among those with less mathematical training: calculus. For that matter, there are even those who can wield calculus as a tool for producing numbers, but with little appreciation for the ideas at work behind it.

This post will rely largely on intuition rather than rigor; references are provided at the end of the post for those who want a look at the inner workings behind calculus. However, we will not shy away from notation either, in the hope that people who find the symbols intimidating will instead come to appreciate them as helpful tools expressing elegant ideas.

##### I. The Derivative

We will begin with a somewhat whimsical example. Consider a house, and suppose in this house the living room is twice as large (in area for example) as the dining room. Now suppose we hit the entire house with a “shrinking ray”, shrinking it down so that it can no longer be seen with the naked eye nor measured with our usual measuring instruments. So if someone asks for the size of the house now, we might as well say that it’s negligible. What about the size of the living room? Negligible as well. The size of the dining room? Also negligible.

However, as long as this “shrinking ray” works the way it does in the movies, we can ask how big the living room is compared to the dining room, and the answer is the same as it was before. It is twice as big, even though both are now of essentially negligible size.

It is this idea that lies at the heart of calculus; an expression built out of negligible quantities may turn out to not be negligible at all. In our example, the particular expression in question is a ratio of two negligible quantities. This will lead us to the notion of a derivative.

But first, we will consider another example, which at first glance will seem unrelated to our idea of negligible quantities. Suppose that we want to find out about the speed of a certain car as it travels a distance of 40 kilometers from City A to City B. Suppose the car has no speedometer, or that the speedometer is busted. The only way we can figure out the speed is from its definition as the ratio of the distance traveled to the time of travel. Suppose the journey from City A to City B takes one hour. This would mean that the car is travelling at a rather leisurely (or slow) speed of 40 kilometers per hour.

But suppose we provide additional information about the trip. Let’s say that the first half of the distance between City A and City B took up 45 minutes, due to heavy traffic. Therefore the second half of the distance was traversed in only fifteen minutes. Then we can say that the car was actually travelling pretty slowly at around 27 kilometers per hour for the first half of the distance. For the second half of the distance, the car was actually travelling pretty fast at 80 kilometers per hour.

Let’s provide even more information about the trip. Let’s say that the last quarter of the distance took up ten minutes of the trip. So the car was traveling slower again, at 60 kilometers per hour. This means the third quarter of the distance, a distance of 10 kilometers, was traversed in only 5 minutes. In other words, the car was travelling quite fast at 120 kilometers per hour.

Although the car is travelling at a rather slow speed on the average, there is a part of the trip where the car travels pretty fast. We only learn about this if we look at parts of the trip instead of just averaging on the whole. And we learn more if we look at smaller and smaller parts – perhaps we can learn most if we look at parts so small, that they might as well be negligible. And so we make contact, once again, with the core idea that we are trying to discuss.
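The arithmetic of this example can be reproduced with a short Python sketch. The segment distances and times are the ones given above; the function name `average_speed_kmh` and the segment labels are our own choices for illustration.

```python
# Segments of the trip as (label, distance in km, time in minutes),
# taken from the City A to City B example.
segments = [
    ("first half", 20, 45),
    ("third quarter", 10, 5),
    ("last quarter", 10, 10),
]

def average_speed_kmh(distance_km, time_min):
    """Average speed = distance / time, with the time converted to hours."""
    return distance_km / (time_min / 60)

# Averaging over the whole trip hides what happens on each part.
total_km = sum(d for _, d, _ in segments)
total_min = sum(t for _, _, t in segments)
whole_trip = average_speed_kmh(total_km, total_min)

# Looking at smaller parts reveals the slow and fast stretches.
per_segment = {name: average_speed_kmh(d, t) for name, d, t in segments}

print(f"whole trip: {whole_trip:.1f} km/h")
for name, speed in per_segment.items():
    print(f"{name}: {speed:.1f} km/h")
```

The whole-trip figure is 40 kilometers per hour, while the individual parts range from about 27 to 120 kilometers per hour, which is exactly the information that the overall average conceals.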

As may be familiar from high school physics, the average speed (we symbolize it here by $v$) is often written as the ratio of a difference in distances (symbolized here by $x$) to a difference in times (symbolized here by $t$). In mathematics, a quantity which is a difference of two other quantities is often written using the symbol $\Delta$. Therefore, the formula for the average speed may be written as follows:

$\displaystyle v=\frac{\Delta x}{\Delta t}$

However, as we have demonstrated in our example above, we learn more by dividing the quantities into smaller and smaller parts.

When our quantities, in this particular example the differences of distances and times, are so small that they might as well be negligible, instead of using the symbols $\Delta x$ and $\Delta t$ we instead write $dx$ and $dt$. Therefore we write

$\displaystyle v=\frac{dx}{dt}$

We review some of the concepts we have discussed. What is the value of $dx$? It’s essentially negligible; it’s not quite zero, but so close to it that we can’t really say anything more about it. What about $dt$? Again, essentially negligible. But what about $v$? Despite it being a ratio of two “essentially negligible” quantities, $v$ itself is not negligible!

As demonstrated earlier in our example, $v$ will be different depending on which part of the journey we are considering. To make this more explicit, we can write

$\displaystyle v=\frac{dx}{dt}|_{t=t_{0}}$

or

$\displaystyle v=\frac{dx}{dt} (t_{0})$

to indicate that we mean $v$ at the specific instant of time $t_{0}$. Because we are looking at specific instants of time, we refer to this speed as the instantaneous speed. If we had a speedometer, we could simply take a reading from it at the specific instant of time $t_{0}$, and the reading would be the instantaneous speed. However, we assumed that we could not do this; therefore, to figure out the instantaneous speed we need to take extremely small measurements of distance and of time, somewhere around the specific instant of time $t_{0}$, and take their ratio.

The problem, of course, is that the quantities are so small that they are essentially negligible. We may not be able to measure them, since they may be much smaller than the limits of accuracy of our measuring instruments. So how could we get such a ratio, which is not negligible, from two essentially negligible quantities that we cannot even measure? Luckily, if we can express one quantity as a function of the other, we have a way of calculating this ratio. We discuss this method next.

In mathematics, we often use the concept of functions to express how one quantity depends on another. In this case, given a particular form of a function $f(x)$, we can obtain $\frac{df}{dx}$ as another function of $x$. We need our function $f(x)$ to be “continuous” so that we always know that $df$ is extremely small whenever $dx$ is extremely small.

Consider the quantity

$\displaystyle \frac{f(x+\epsilon)-f(x)}{(x+\epsilon)-(x)}$

This is a ratio of differences; the numerator is a difference, and so is the denominator; therefore we can also write this as

$\displaystyle \frac{\Delta f}{\Delta x}$

Now suppose the quantity $\epsilon$ is extremely small. In this case, the denominator may be rewritten; instead of writing $\Delta x$, we write $dx$, since the difference between $x+\epsilon$ and $x$ is extremely small (in fact it is just $\epsilon$). As for the numerator, if the function is continuous as described above, then we know automatically that it is also extremely small, and we may therefore also write $df$ instead of $\Delta f$. Therefore, we have

$\displaystyle \frac{df}{dx}=\frac{f(x+\epsilon)-f(x)}{(x+\epsilon)-(x)}$ when $\epsilon$ is extremely small (essentially negligible)

Let’s see an explicit example of this in action. Suppose we have $f(x)=x^{2}$. Then we have

$\displaystyle \frac{\Delta f}{\Delta x}=\frac{(x+\epsilon)^{2}-x^{2}}{(x+\epsilon)-(x)}$

Using some high school algebra, we can expand and simplify the right hand side:

$\displaystyle \frac{\Delta f}{\Delta x}=\frac{x^{2}+2x\epsilon+\epsilon^{2}-x^{2}}{x+\epsilon-x}$

$\displaystyle \frac{\Delta f}{\Delta x}=\frac{2x\epsilon+\epsilon^{2}}{\epsilon}$

$\displaystyle \frac{\Delta f}{\Delta x}=\frac{\epsilon(2x+\epsilon)}{\epsilon}$

$\displaystyle \frac{\Delta f}{\Delta x}=\frac{(2x+\epsilon)}{1}$

$\displaystyle \frac{\Delta f}{\Delta x}=2x+\epsilon$

Now to obtain $\frac{df}{dx}$, we just need $\epsilon$ to be extremely small, essentially negligible; in any case, it should be so much smaller than any other term in the expression that we might as well chalk it up to measurement error, like the difference between the weight of a sack of flour and the weight of a sack of flour with one extra grain of flour. In other words, $2x+\epsilon$ and $2x$ are essentially the same, and $\epsilon$ is so small that we might as well set it to zero. Therefore we have

$\displaystyle \frac{df}{dx}=2x$

Note that $df$ by itself is essentially negligible; in this case it is equal to $2x\epsilon+\epsilon^{2}$, and since $\epsilon$ is extremely small, the entire expression itself is also extremely small and essentially negligible. Of course, $dx$ is just $\epsilon$, and is also extremely small and essentially negligible. But in accordance with our “important idea” stated earlier, the ratio $\frac{df}{dx}$, called the derivative of $f$ with respect to $x$, is not negligible. The derivative of $f$ with respect to $x$ is also often written $f'(x)$.
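For readers who like to experiment, here is a minimal Python sketch of the same computation done numerically. The point $x=3$ and the particular values of $\epsilon$ are arbitrary choices for illustration, and `difference_quotient` is just a name we made up for the ratio of differences.

```python
def f(x):
    return x ** 2

def difference_quotient(func, x, eps):
    """Ratio of differences: (f(x + eps) - f(x)) / ((x + eps) - x)."""
    return (func(x + eps) - func(x)) / eps

x = 3.0
# Shrink eps and watch the ratio settle down, even though the
# numerator and the denominator each become negligible.
quotients = {eps: difference_quotient(f, x, eps) for eps in (1.0, 0.1, 0.001, 1e-6)}

for eps, q in quotients.items():
    print(f"eps = {eps:g}: quotient = {q:.6f}, 2x + eps = {2 * x + eps:.6f}")
```

As $\epsilon$ shrinks, the quotient tracks $2x+\epsilon$ and gets closer and closer to $2x=6$. On a computer, $\epsilon$ cannot be made arbitrarily small, since rounding errors in floating point arithmetic eventually spoil the subtraction in the numerator; this is a limitation of the machine, not of the idea.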

So going back to our example of the car going from City A to City B, if we could have an expression that gives us the distance traveled by the car in terms of its time of travel, we could calculate the instantaneous speed at any time by taking the derivative of that expression.

If a certain quantity depends on several other quantities, for example, if it is a function $f(x,y,z)$ of several independent variables $x$, $y$, and $z$, the derivative of $f$ with respect to any one of the independent variables, suppose $x$, is called the partial derivative of $f$ with respect to $x$, and written $\frac{\partial f}{\partial x}$, or sometimes $\partial_{x} f$.
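As a small worked illustration (the function is our own choice, picked only because it is easy to compute with), take $f(x,y)=x^{2}y$. To find the partial derivative with respect to $x$, we treat $y$ as a fixed number and repeat the computation we did for $x^{2}$:

$\displaystyle \frac{f(x+\epsilon,y)-f(x,y)}{(x+\epsilon)-(x)}=\frac{(x+\epsilon)^{2}y-x^{2}y}{\epsilon}=\frac{(2x\epsilon+\epsilon^{2})y}{\epsilon}=(2x+\epsilon)y$

When $\epsilon$ is extremely small, this gives

$\displaystyle \frac{\partial f}{\partial x}=2xy$

Treating $x$ as fixed instead and letting only $y$ change by $\epsilon$, the same kind of computation gives $\frac{\partial f}{\partial y}=x^{2}$.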

##### II. The Integral

We next discuss another expression, aside from a ratio, that is made out of essentially negligible quantities but is not itself negligible. Consider the weight of a grain of flour; as stated in an earlier example, we often think of it as essentially negligible. The weight of a sack of flour, on the other hand, is certainly not often thought of as negligible. But the sack of flour itself is made up of grains of flour; therefore these “essentially” negligible things, when there are enough of them, may combine into something that is not itself negligible.

We consider a summation of a certain number of terms, and we also consider an interval of real numbers from the real number $a$ to the real number $b$. If we sum two terms, we divide this interval into two; if we sum three terms, we divide it into three; and so on. Consider now the summation of five terms

$f(b)(b-x_{4})+f(x_{4})(x_{4}-x_{3})+f(x_{3})(x_{3}-x_{2})+f(x_{2})(x_{2}-x_{1})+f(x_{1})(x_{1}-a)$

where $f(x)$ is a function of real numbers, defined on the interval from $a$ to $b$, and $x_{1}$, $x_{2}$, $x_{3}$, and $x_{4}$ are quantities between $a$ and $b$ dividing the interval from $a$ to $b$ into five equal parts. If we have, for example, $a=0$ and $b=100$, then $x_{1}=20$, $x_{2}=40$, $x_{3}=60$, and $x_{4}=80$.

For further simplification we can also write $x_{5}=b$ and $x_{0}=a$. We can then write the same sum as

$f(x_{5})(x_{5}-x_{4})+f(x_{4})(x_{4}-x_{3})+f(x_{3})(x_{3}-x_{2})+f(x_{2})(x_{2}-x_{1})+f(x_{1})(x_{1}-x_{0})$.

This can be expressed in more compact notation as

$\displaystyle \sum_{i=1}^{5} f(x_{i})(x_{i}-x_{i-1})$

or, to keep it general, we divide the interval between $a$ and $b$ into $n$ subdivisions where $n$ is any positive integer instead of just five, so that we have

$\displaystyle \sum_{i=1}^{n} f(x_{i})(x_{i}-x_{i-1})$

Note that as the number $n$ of subdivisions of the interval between $a$ and $b$ increases, the quantity $(x_{i}-x_{i-1})$ decreases. We consider another sum, namely

$\displaystyle \sum_{i=1}^{n} f(x_{i-1})(x_{i}-x_{i-1})$

This is different from the other sum above. However, we note that as we increase the number of subdivisions, the quantity $(x_{i}-x_{i-1})$ decreases, and the quantities $x_{i}$ and  $x_{i-1}$ become essentially equal to each other. If our function $f(x)$ is “continuous”, then $f(x_{i})$ and $f(x_{i-1})$ become essentially equal to each other too. When we reach so many subdivisions that the quantity $(x_{i}-x_{i-1})$ becomes extremely small or essentially negligible, and the quantities $f(x_{i})$ and $f(x_{i-1})$ become essentially equal to each other, we write

$\displaystyle \int_{a}^{b}f(x)dx=\sum_{i=1}^{n} f(x_{i})(x_{i}-x_{i-1})=\sum_{i=1}^{n} f(x_{i-1})(x_{i}-x_{i-1})$

The summation $\int_{a}^{b}f(x)dx$ is called the integral of $f(x)$ from $a$ to $b$.
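The two sums can be compared numerically with a short Python sketch. The choice of $f(x)=x^{2}$ on the interval from $0$ to $1$ is ours, made only for illustration, and `riemann_sums` is a hypothetical helper name.

```python
def riemann_sums(func, a, b, n):
    """Return (right_sum, left_sum) over n equal subdivisions of [a, b].

    right_sum uses f(x_i), left_sum uses f(x_{i-1}), matching the two
    sums written above.
    """
    width = (b - a) / n
    xs = [a + i * width for i in range(n + 1)]  # x_0 = a, ..., x_n = b
    right = sum(func(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    left = sum(func(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    return right, left

def f(x):
    return x ** 2

for n in (5, 50, 5000):
    right, left = riemann_sums(f, 0.0, 1.0, n)
    print(f"n = {n}: right sum = {right:.6f}, left sum = {left:.6f}, gap = {right - left:.6f}")
```

With only five subdivisions the two sums differ noticeably, but as $n$ grows the gap between them shrinks towards zero and both settle near the same value, which is the sense in which they become “essentially equal”.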

The integral is a sum of terms of the form $f(x)dx$, which is the product of $f(x)$ and $dx$. In fact, the integral symbol itself, $\int$, is simply an old version of the letter “s”, to show that it stands for sum, in the same way that the letter “d” in $dx$ represents a very small difference.

If we think of $dx$ as a very small “width” and $f(x)$ as some sort of “height”, then $f(x)dx$ is some kind of very small “area” of a very “thin” rectangle, and the integral $\int_{a}^{b}f(x)dx$ gives the total area under the curve $f(x)$ from $a$ to $b$, taken by dividing this area into very “thin” rectangles and adding up the areas of each of these rectangles.

But there are also other quantities of the form $f(x)dx$. For example, if we think of $f(x)dx$ as the probability that a certain quantity, say, the height of a random person one may meet on the street, has a value very close to $x$, then the integral $\int_{a}^{b}f(x)dx$ gives us the total probability that this quantity has a value between $a$ and $b$.

##### III. The Fundamental Theorem of Calculus

Unlike the derivative, the integral is actually rather difficult to compute. It is a sum of very many terms, each of which is a product of one quantity with an essentially negligible quantity. However, there is a shortcut to computing the integral, which involves the derivative, and the discovery of this “shortcut” is one of the greatest achievements in the history of mathematics.

This “shortcut” which relates the integral and the derivative is so important and so monumental that it is called the fundamental theorem of calculus, and its statement is as follows:

$\displaystyle \int_{a}^{b}\frac{df}{dx}dx=f(b)-f(a)$

The notation itself is already very suggestive as to the intuition behind this theorem. It is also perhaps worth noting that the integral is a sum of products, while the derivative is a ratio of differences. The function $f(x)$ is defined on the interval from $a$ to $b$, so if we sum all the tiny differences $df$ from $f(a)$ to $f(b)$ we would get $f(b)-f(a)$. Of course the rigorous proof of this theorem is much more involved than the “plausibility argument” that we have presented here.
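The plausibility argument can be written out a little more explicitly using the subdivisions $x_{0}=a,x_{1},\ldots,x_{n}=b$ from the previous section. This is only a sketch, not a proof. If we replace $\frac{df}{dx}$ on each subdivision by the ratio of differences, then each product in the sum simplifies, and the remaining differences cancel in pairs:

$\displaystyle \sum_{i=1}^{n}\frac{f(x_{i})-f(x_{i-1})}{x_{i}-x_{i-1}}(x_{i}-x_{i-1})=\sum_{i=1}^{n}\left(f(x_{i})-f(x_{i-1})\right)=f(x_{n})-f(x_{0})=f(b)-f(a)$

Every intermediate value $f(x_{1}),\ldots,f(x_{n-1})$ appears once with a plus sign and once with a minus sign, so only the two endpoints survive, no matter how many subdivisions we use.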

In practice, what this means is that if we want to solve for the integral of a function, we only need to “reverse” what we did to solve for the derivative. This is why the integral (more precisely the “indefinite” integral) is also sometimes called the antiderivative. Earlier we solved for the derivative of the function $f(x)=x^{2}$ with respect to $x$ and obtained $\frac{df}{dx}=2x$. If we now ask what the integral of $2x$ from $a$ to $b$ is, we will find, using the fundamental theorem of calculus, that $\int_{a}^{b}2xdx=b^{2}-a^{2}$.
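As a sanity check, the “long way” (adding up many thin rectangles) and the “shortcut” (evaluating $b^{2}-a^{2}$) can be compared in a few lines of Python. The endpoints $a=1$ and $b=4$ and the number of subdivisions are arbitrary choices for illustration, and `integral_estimate` is a hypothetical helper name.

```python
def integral_estimate(func, a, b, n=100_000):
    """Approximate the integral of func from a to b by the sum of
    func(x_i) * (x_i - x_{i-1}) over n equal subdivisions."""
    width = (b - a) / n
    return sum(func(a + i * width) * width for i in range(1, n + 1))

def derivative_of_square(x):
    return 2 * x  # df/dx for f(x) = x^2, as computed earlier

a, b = 1.0, 4.0
estimate = integral_estimate(derivative_of_square, a, b)
exact = b ** 2 - a ** 2  # f(b) - f(a), from the fundamental theorem

print(f"sum of thin rectangles: {estimate:.4f}")
print(f"f(b) - f(a):            {exact:.4f}")
```

The two numbers agree closely, and the small remaining gap shrinks further as the number of subdivisions is increased. The shortcut needs only two evaluations of $f$, while the sum needs one term per subdivision.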

We now summarize, in an effort to show that there’s nothing to be scared of when it comes to derivatives and integrals:

$\displaystyle \frac{df}{dx}$ is a ratio of extremely small differences $\displaystyle df$ and $\displaystyle dx$

$\displaystyle \int_{a}^{b}f(x)dx$ is a sum of the products of $\displaystyle f(x)$ and the extremely small differences $\displaystyle dx$

That being said, despite the rather simple intuition we have laid out here, making the language of calculus “precise”, or rather “rigorous”, is a much bigger task that has taken centuries of development, and even after mathematicians agreed on how to develop calculus, it has been “resurrected” time and time again to come up with even more powerful versions of calculus. For example, there is a version of the integral called the Lebesgue integral, in which instead of $dx$ we use a quantity called the “measure” which may not necessarily be extremely small. Even so, this integral is still a sum of products, and the concept of extremely small and essentially negligible quantities still shows up, since there will be parts where the measure will become extremely small and essentially negligible.

As for dealing with the extremely small and essentially negligible, this is made more precise using the language of limits. In older times, another concept was used, called infinitesimals; however, there were so many questions from philosophers that the concept was basically abandoned in favor of the language of limits. The concept of infinitesimals is itself the subject of much research in modern times, since it was found that, with modern developments in mathematics, infinitesimals can be made precise and can still be useful. The modern study of calculus and its related subjects goes by the much simpler-sounding name of analysis.

Finally, even though we attempted an answer to the question “What is calculus?”, we did not really explain how to do calculus or how to apply it, although there are tons and tons of applications of calculus in the modern world. For all these we direct the reader to the references at the end of the post, which will help those who want to learn more about the subject.

In my opinion however, the best way to learn calculus is to master first the language of limits, or more generally learn how to deal with extremely small and essentially negligible quantities, without necessarily going to derivatives and integrals, since derivatives and integrals themselves are merely specific applications of the philosophy, once more recalled here, that an expression built out of negligible quantities may turn out to not be negligible at all. And it is best to learn how to deal with this in the most general case. In some schools this makes up the subject of precalculus. I would recommend very much the classic book of one of the greatest mathematicians of all time, Leonhard Euler, entitled Introduction to Analysis of the Infinite, of which translations abound on the internet.

References:

Calculus on Wikipedia

Introduction to Analysis of the Infinite by Leonhard Euler (translated by Ian Bruce)

Calculus by James Stewart

Principles of Mathematical Analysis by Walter Rudin