When I first learned integration by substitution, I saw \(\frac{\mathrm{d}u}{\mathrm{d}x}\) get rearranged to \(\mathrm{d}u=⋯\mathrm{d}x\). My first reaction was

WTF?????

How can you just do that???

\(\mathrm{d}x\) basically specifies the variable that changes in differentiation and integration. In English it reads “with respect to \(x\)”, which makes it more like part of an operator. Putting it by itself without context makes no sense; it’s like writing \(\div 3=x\), or saying “with respect to \(x\) equals …”. It’s literally an incomplete sentence! What with respect to \(x\)? But it somehow works. How???

So I took some time to go back to the meaning of the notation and the root of these rules, and finally figured it out (I think). Disclaimer: my interpretation is just one of many, perhaps a really basic one, or straight up wrong. I couldn’t find anything online to verify or disprove my interpretation, so please point out any mistakes I made.

What does \(\frac{\mathrm{d}}{\mathrm{d}x}\) mean? (And is it a fraction?)

Let’s start with the basic definition of differentiation, expressed as a limit.

$$\displaystyle \frac{\mathrm{d}}{\mathrm{d}x} f(x)= \lim_{k \to 0} \frac{f(x+k)-f(x)}{k}$$
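To make the limit concrete, here is a minimal Python sketch that evaluates the quotient inside the limit for shrinking values of \(k\). The name `difference_quotient` and the choice of \(f(x)=x^2\) at \(x=3\) are illustrative assumptions, not part of the definition.

```python
def difference_quotient(f, x, k):
    """Return (f(x + k) - f(x)) / k, the expression inside the limit."""
    return (f(x + k) - f(x)) / k


def square(x):
    """Illustrative choice of f: f(x) = x^2, whose derivative is 2x."""
    return x * x


# As k shrinks, the quotient at x = 3 approaches the derivative 2 * 3 = 6.
steps = (1e-1, 1e-3, 1e-5)
quotients = [difference_quotient(square, 3.0, k) for k in steps]
for k, q in zip(steps, quotients):
    print(f"k = {k:g}: quotient = {q:.6f}")
```

Nothing here involves a fraction of two “d” quantities; the only fraction is the ordinary quotient inside the limit.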

It is important to note that \(x\) doesn’t have to be \(x\); it can be any other variable, or even a function, like

$$\displaystyle \frac{\mathrm{d}}{\mathrm{d}g(x)}f(g(x))=\lim_{k \to 0} \frac{f(g(x)+k)-f(g(x))}{k}$$

Also, \(\frac{\mathrm{d}}{\mathrm{d}x}\) is not a fraction; it is simply an operator, like \(\div x\), so we can’t say \(\frac{\mathrm{d}}{\mathrm{d}x}f(x)=\frac{\mathrm{d}f(x)}{\mathrm{d}x}\) (at least not without the special trick we will see later).

The chain rule and its influence on notations

A very important part of calculus is the chain rule. Using the notations we just established, the chain rule is as follows:

$$\displaystyle \frac{\mathrm{d}}{\mathrm{d}x} f(g(x))=\frac{\mathrm{d}}{\mathrm{d}x} g(x) \cdot \frac{\mathrm{d}}{\mathrm{d}g(x)} f(g(x))$$
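The rule can be checked numerically without any fraction-like notation. The sketch below is only an illustration: the helper `derivative` uses a central difference (a more accurate stand-in for the one-sided limit above), and the functions \(g(x)=x^2\) and \(f(y)=\sin y\) are arbitrary choices.

```python
import math


def derivative(f, x, k=1e-6):
    """Approximate f'(x) with a central difference (hypothetical helper)."""
    return (f(x + k) - f(x - k)) / (2 * k)


def g(x):
    """Inner function, chosen for illustration: g(x) = x^2."""
    return x * x


def f(y):
    """Outer function, chosen for illustration: f(y) = sin(y)."""
    return math.sin(y)


x = 1.3

# Left side: differentiate the composition f(g(x)) with respect to x.
lhs = derivative(lambda t: f(g(t)), x)

# Right side: d/dx g(x) times d/dg(x) f(g(x)). The second factor is the
# derivative of f taken at the point g(x), treating g(x) as the variable.
rhs = derivative(g, x) * derivative(f, g(x))

print(lhs, rhs)
```

The two printed values agree up to the approximation error, which is all the chain rule claims.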

The proof of the chain rule is quite long and complicated, so we won’t delve into it. Later, when we introduce the new notation, the rule will look legitimate thanks to “cancelling”, but remember that it looks that way only because we chose to fit the notation to the rule.

Now Leibniz probably noticed that the chain rule behaves awfully like fractions if it is written in a certain way. That is, if we loosen our loyalty to the actual meaning of things a bit and allow ourselves to treat \(\frac{\mathrm{d}}{\mathrm{d}x}\) as a fraction (even though it is not):

$$\displaystyle \frac{\mathrm{d}f(g(x))}{\mathrm{d}x} =\frac{\mathrm{d}g(x)}{\mathrm{d}x} \cdot \frac{\mathrm{d}f(g(x))}{\mathrm{d}g(x)}$$

or more concisely

$$\displaystyle \frac{\mathrm{d}z}{\mathrm{d}x}=\frac{\mathrm{d}y}{\mathrm{d}x}\cdot\frac{\mathrm{d}z}{\mathrm{d}y}$$

The \(\mathrm{d}y\) on the right side nicely “cancels out” like a fraction. But in the midst of beautiful notation, we must remember that nothing actually “cancels out”. The only thing happening is the chain rule, and we (or Leibniz) chose to write it in a way that creates the illusion that something “cancels out” like a fraction.

Notations and integration by substitution

Integration notation is designed to maintain the illusion. Notice that

$$\displaystyle \int \frac{\mathrm{d}y}{\mathrm{d}x}\mathrm{d}x = y + C$$

It looks like the \(\mathrm{d}x\) cancelled out when in fact it is just how we defined the integral.

The extended illusion allows us to use some tricks when we solve integration by substitution problems. Integration by substitution is:

$$\displaystyle \int z(y) \cdot \frac{\mathrm{d}y}{\mathrm{d}x} \; \mathrm{d}x=\int z(y) \; \mathrm{d}y$$

Again, even though it looks like \(\mathrm{d}x\) cancelled out like a fraction, in reality only the chain rule is used. These nicely sustained illusions allow us to use notational tricks when we are solving problems. For example, suppose we want to evaluate the integral on the left side. We can write

$$\displaystyle \begin{aligned} u &= y \\ \frac{\mathrm{d}}{\mathrm{d}x} u &= \frac{\mathrm{d}}{\mathrm{d}x} y \; \text{(perfectly legitimate and meaningful)} \\ \frac{\mathrm{d}u}{\mathrm{d}x} &= \frac{\mathrm{d}y}{\mathrm{d}x} \; \text{(beginning of notational tricks)} \\ \mathrm{d}u &= \frac{\mathrm{d}u}{\mathrm{d}x} \mathrm{d}x \; \text{(meaningless)}\\ \int z(y) \cdot \frac{\mathrm{d}y}{\mathrm{d}x} \; \mathrm{d} x &=\int z(u) \; \mathrm{d}u \; \text{(meaningful and true)} \end{aligned}$$
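The last line of the derivation can be checked numerically. The sketch below assumes a definite version of the formula, with the limits changed from \(x\)-values to \(y\)-values (a detail the indefinite form leaves implicit), and uses arbitrary example functions \(y=x^2\) and \(z(u)=\cos u\). The helper `midpoint_integral` is a plain midpoint-rule approximation, not anything from the derivation itself.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def y(x):
    """Illustrative substitution: y = x^2."""
    return x * x


def dy_dx(x):
    """Derivative of y with respect to x."""
    return 2 * x


def z(u):
    """Illustrative integrand: z(u) = cos(u)."""
    return math.cos(u)


a, b = 0.0, 1.5

# Left side: integrate z(y) * dy/dx with respect to x over [a, b].
left = midpoint_integral(lambda x: z(y(x)) * dy_dx(x), a, b)

# Right side: integrate z(u) with respect to u over [y(a), y(b)].
right = midpoint_integral(z, y(a), y(b))

# Antiderivative of cos is sin, so the exact value is known here.
exact = math.sin(y(b)) - math.sin(y(a))

print(left, right, exact)
```

All three values agree up to the approximation error, even though the middle steps of the derivation had no meaning of their own.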

I want to emphasize that we can do this only because the notation is so well designed that we can enter a meaningless notational world and exit it correctly. Using these tricks requires caution. Some mistakes are obvious. For example,

$$\displaystyle \frac{\mathrm{d}}{\mathrm{d}x}x^{2} = \frac{\mathrm{d}x}{\mathrm{d}x}x = x$$
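For comparison, applying the limit definition directly gives the actual derivative, which is not \(x\):

$$\displaystyle \frac{\mathrm{d}}{\mathrm{d}x}x^{2} = \lim_{k \to 0} \frac{(x+k)^{2}-x^{2}}{k} = \lim_{k \to 0} \left(2x+k\right) = 2x$$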

The fact that these things look like fractions and sometimes behave like fractions (and sometimes don’t) is a constant source of confusion (at least for me).

To illustrate the point that the seemingly sensible and self-evident tricks are just an illusion of a well-designed notational system, I am going to design two alternative notational systems. In one of them the tricks simply would not work; in the other the tricks lead to false results.

Notational system to which no trick can be applied

Write differentiation as \(cx \vert f(x)\), which means \(\frac{\mathrm{d}}{\mathrm{d}x} f(x)\) in the normal system.

Now the chain rule reads:

$$\displaystyle cx \vert z = cx \vert y \cdot cy \vert z$$

Write integration as

$$\displaystyle \int cx \vert z(x)$$
which means
$$\displaystyle \int z(x) \; \mathrm{d}x$$
in the normal system.

So now integration by substitution reads

$$\displaystyle \int cx \vert \left(z(x) \cdot cx \vert y \right) = \int cy \vert z(y)$$

As you can see, nothing “cancels out” the way it does in our current system, and we can’t write \(cu=⋯cx\) and substitute it in.

Notational system in which using the tricks leads to false results

Write differentiation as \(\frac{ox}{of(x)}\). It means the same as \(\frac{\mathrm{d}}{\mathrm{d}x} f(x)\) in the normal system. It is very similar to the normal one, except that everything is upside down.

Now the chain rule reads:

$$\displaystyle \frac{ox}{oz} = \frac{ox}{oy} \cdot \frac{oy}{oz}$$

Things seem to “cancel out” quite nicely. Integration keeps its usual shape, with \(ox\) in place of \(\mathrm{d}x\):

$$\displaystyle \int\frac{ox}{oy}\;ox=y+C$$

So now integration by substitution reads

$$\displaystyle \int z(y) \cdot \frac{ox}{oy}\;ox = \int z(y) \; oy$$

Now if we use the same tricks, we get

$$\displaystyle \begin{aligned} u &= y \\ \frac{ox}{ou} &= \frac{ox}{oy} \\ oy &= \frac{ou}{ox}\;ox \\ \int z(y) \cdot oy & = \int z(u)\frac{ou}{ox}\;ox\; \left(\text{which is} \int z(u) \cdot \frac{\mathrm{d}x}{\mathrm{d}u}\;\mathrm{d}x \; \text{in normal notations}\right) \end{aligned}$$

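This is clearly false: in normal notation the correct right-hand side is \(\int z(u) \cdot \frac{\mathrm{d}u}{\mathrm{d}x}\;\mathrm{d}x\), with the derivative the other way up.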
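A numerical comparison makes the gap concrete. This is only a sketch under my own assumptions: the integrals are read as definite integrals over \(x \in [1, 2]\), the example functions are \(u=y=x^2\) and \(z(u)=\cos u\), and `midpoint_integral` is a simple midpoint-rule helper.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def y(x):
    """Illustrative substitution: u = y = x^2."""
    return x * x


def dx_du(x):
    """Derivative of x with respect to u = x^2, written in terms of x."""
    return 1 / (2 * x)


def z(u):
    """Illustrative integrand: z(u) = cos(u)."""
    return math.cos(u)


a, b = 1.0, 2.0  # x = 0 is avoided because dx/du is undefined there

# What integration by substitution actually gives: z integrated over y.
correct = midpoint_integral(z, y(a), y(b))

# What the trick produces in the upside-down system: z(u) * dx/du over x.
trick = midpoint_integral(lambda x: z(y(x)) * dx_du(x), a, b)

print(correct, trick)
```

The two numbers are nowhere near each other, so the same symbol-pushing that works in the normal notation gives a wrong answer here.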

As you can see, many of the integration techniques we use have no meaning in themselves and are just notational tricks. That’s why they can be confusing at times and lead to mistakes if one is not careful. Thank you for making it this far. Hopefully that was clear and interesting.

CC0
To the extent possible under law, Deyao Chen has waived all copyright and related or neighboring rights to What is du?. This work is published from: Hong Kong.