Though it becomes difficult (as well as time-consuming) when you start to throw around effects that deal circumstantially-dependent amounts of damage.
For example, the total damage output of a fireball changes depending on how many enemies you can get in its area of effect when you cast it. That, in turn, depends on the number of enemies you're fighting, the environment you're fighting them in, your ability to force or entice enemies to move where you want them, whether you're willing to catch allies in the blast (at a penalty affected by your allies' defensive abilities), etc.
Obviously, that means that fireball will be better than other attack options in some circumstances and worse in others. Choosing parameters that make it "balanced overall" involves choosing an arbitrary standard for how much and how often a specialized attack should help you, and then making a lot of incredibly complicated and subtle estimates in order to apply that standard in a vaguely fair fashion not just to fireball but also to chain lightning, poison cloud, and all your other attacks that deal different total amounts of damage depending on the circumstances.
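To make that concrete, here is a minimal sketch of the kind of estimate involved. Every number in it is made up for illustration (the per-target damage, the single-target benchmark, and especially the guesses at how often you catch one, two, or three enemies), and `expected_area_damage` is a hypothetical helper, not something from any actual game.

```python
def expected_area_damage(damage_per_target, target_count_probs):
    """Expected total damage of an area attack.

    target_count_probs maps "number of enemies caught" to the estimated
    probability of catching that many. Those probabilities are the
    arbitrary, hard-to-estimate part.
    """
    return sum(p * n * damage_per_target for n, p in target_count_probs.items())


# Hypothetical benchmark: a single-target attack that always deals 10.
SINGLE_TARGET_DAMAGE = 10

# Hypothetical guess for a fight in the open: enemies bunch up fairly often.
open_field = expected_area_damage(6, {1: 0.25, 2: 0.5, 3: 0.25})   # 12.0

# Hypothetical guess for a cramped corridor: you rarely catch more than one.
corridor = expected_area_damage(6, {1: 0.75, 2: 0.25})             # 7.5
```

With these invented numbers the fireball beats the single-target attack in the open (12 vs. 10) and loses to it in the corridor (7.5 vs. 10). Whether it counts as "balanced overall" depends entirely on how often you assume each situation comes up.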
As an example, while designing Descent: The Enduring Evil, I made a complicated spreadsheet for comparing hero attacks to monster resilience as a balancing aid. "Normal" attacks used exact calculations to determine the average number of swings needed to bring down a monster. But in order to support other weapons, I had to plug in algorithms for estimating the utility of various forms of area damage, damage-over-time, and other effects. I think I did pretty well, but having played the finished product for a while, I also think that weapons with certain abilities ended up systematically over- or under-powered because my estimate for how often their ability would apply was off.
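For a sense of what the "exact" part can look like, here is a rough sketch of an average-swings calculation. This is not the actual spreadsheet, and it does not use Descent's dice or armor rules. It just assumes you already have a hypothetical table of "damage dealt per swing" probabilities, and `expected_swings` is a name invented for this example.

```python
def expected_swings(hit_points, damage_probs):
    """Average number of swings needed to reduce hit_points to 0.

    damage_probs maps damage dealt by one swing (after any defenses) to
    its probability. A damage of 0 represents a miss or a fully absorbed hit.
    """
    p_zero = damage_probs.get(0, 0.0)
    if p_zero >= 1.0:
        raise ValueError("this attack can never deal damage")

    # expected[hp] = average swings still needed with hp hit points left.
    expected = [0.0] * (hit_points + 1)
    for hp in range(1, hit_points + 1):
        total = 1.0  # the swing we are about to make
        for damage, p in damage_probs.items():
            if damage > 0:
                total += p * expected[max(hp - damage, 0)]
        # Zero-damage swings leave us in the same state, so solve for it.
        expected[hp] = total / (1.0 - p_zero)
    return expected[hit_points]


# Hypothetical weapon: always deals 2 damage, against a 4 HP monster.
always_hits = expected_swings(4, {2: 1.0})            # 2.0

# Same weapon, but it misses half the time.
coin_flip = expected_swings(4, {0: 0.5, 2: 0.5})      # 4.0
```

This works because a plain attack only depends on the monster's remaining hit points. Area damage and damage-over-time depend on things like positioning and fight length, which is where the estimating (and the systematic error) comes in.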
Of course, that's kind of a good thing. If a game is entirely straightforward to balance, that usually means the optimal strategy is obvious, which isn't so good for gameplay.
Also, side note: if an encounter is supposed to have enough damage output to take out an individual PC before the combat ends, then you need to take attrition into account when setting damage benchmarks.
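One way to read that, as a toy sketch: if PCs drop partway through the fight, the party's damage per round shrinks, so a benchmark computed from the full party's output is too optimistic. The model below is deliberately crude and entirely hypothetical (fixed damage per PC, one PC lost every so many rounds, a single pool of monster hit points); `rounds_to_win` is an invented name.

```python
def rounds_to_win(monster_hp, damage_per_pc, party_size, rounds_per_pc_lost=None):
    """Rounds the party needs to deal monster_hp total damage.

    If rounds_per_pc_lost is given, one PC is knocked out at the end of
    every that-many rounds. Returns None if the whole party drops first.
    """
    remaining = monster_hp
    active = party_size
    rounds = 0
    while remaining > 0:
        if active <= 0:
            return None  # party wiped before finishing the fight
        rounds += 1
        remaining -= active * damage_per_pc
        if rounds_per_pc_lost and rounds % rounds_per_pc_lost == 0:
            active -= 1
    return rounds


# Hypothetical fight: 4 PCs dealing 10 each per round against 120 HP of monsters.
no_attrition = rounds_to_win(120, 10, 4)                            # 3
with_attrition = rounds_to_win(120, 10, 4, rounds_per_pc_lost=2)    # 4
```

In this made-up case, losing one PC every two rounds stretches a three-round fight to four, which also gives the monsters an extra round of damage.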