Question #6: What the F*ck is a Transfer Function?

Whoa. There’s a term that you probably haven’t bumped into before, eh? You’ve been pushing pixels for how long, and some bozo tosses a made-up term at you? You might even start to believe that this rubbish series of posts that promised to inform has cheated you, and instead just keeps looping back to the crap you don’t know.

First up, take heed reader… we are at least identifying the things that we don’t know. As it turns out, many of the problems with understanding digital colour come down to understanding that our knowledge stands relative to something. That is, when we first removed the rusty old boat that was covering the stinky well, and asked “What the f*ck happens when we change an RGB slider value?”, we discovered that we were detached from what the value referred to. What did it control? So sit down, buckle up, and prepare to hear “referred” quite a bit more in the coming inquiries…


Having read through the past five questions, you are likely starting to draw a connection between the code values and some device or some model. That is, up to now, we have discussed maximum intensity versus minimum intensity, but not yet described the device they are referring to. We somewhat superficially learned that the colours of the lights in a typical display are described by the sRGB specification, but we haven’t gone much further than that.

While we are starting to define a useful model here, we still can’t drill down into the bedrock and answer hard questions like “What does a code value of 1.0 mean?” until we more deeply consider the thing we are connected to. That is, a code value as we’ve hammered repeatedly, needs decoding to give those three teeny RGB flashlights in our pixels meaning. How do we decode it though?


To give our code values meaning, we are going to hook them up to a display device. We’ll use the bog standard display device known as an sRGB display. What is an sRGB display? We touched on a teeny component of this in Question 4, but that wasn’t good enough, so let’s dive deeper…

Somewhere in a galaxy far, far away around 1996, a group of folks gathered together and designed a specification that describes an imaginary device. Henceforth, vendors went about and constructed a real device from said imaginary device’s specification blueprint.

Take the device home, plug it into a computer, send some code values to it, and hopefully it will behave as that device specification outlined. Whether you realized it or not, and as we touched upon in Question #3, the display you are on has a high probability of being designed to the sRGB standard.¹

Looping back, we take that encoded value, run it through our computer and out to the display, and the display takes the code value and turns it into light. This leads us to our wonderful question number six, which is…

Question #6: What exactly is a transfer function and why is it important?


So what happens when we wrap up a code value and send it to our brand new sRGB display? If we send a code value of [1.0, 1.0, 1.0], do we have any idea how much light is coming out? That question, wise reader, is exactly what a transfer function helps to answer. It’s the peanut butter and jam between the slices of code value and decoded value bread.

So what does 1.0 spit out in terms of light?

“Maximum” you say, gleefully aware of what we’ve discovered. You’d be correct. But can we quantify how much light should come out of one of our RGB lights? The answer is yes, but we would need some metric. While there are quite a few of them out there, the one chosen by the standard is known as a candela per square metre². You may have heard it before, as some folks call it, somewhat colloquially, a nit.

If you had the sRGB specification in front of you, you’d see that ideally, in a reference display, the output from an sRGB display is 80 nits. But let’s focus on our question, which is sorting out specifically what a transfer function is…


Given we know that we have a code value of 1.0, and knowing that we are sending that code value to our display, we can also have faith that the display is decoding that code value according to the specification. You’ve probably already guessed that if our maximum code value is 1.0, and that our maximum idealized nit output of our sRGB display is 80 nits… that yes… we’d get out 80 nits worth of intensity if all three channels were set to maximum! Whatever a nit is…

This is great. Off to digital painting you go, freshly armed with your new knowledge that there is an actual way to calculate light output from your strange code value. But uh… let’s take a moment to ponder other values. What about 0.5? Seems innocuous enough right? I mean it’s 50%… of our sRGB display’s pixel channel intensity… erm… is that… physical energy as in 50% of our maximum 80 nits… or is it uh… something… else?

And now for a preemptive nuclear strike on the WANKS³…

“I ALREADY KNOW THIS SH*T!!!11!! IT’S GAMMA!!1!!” I hate to break it to you my dear Sealioning WANK, but that’s only going to lead you down the path of pain if you discard what you can learn from transfer functions under an umbrella of “gamma”⁴. For everyone else, and maybe even one of the WANKs who has crossed over to being a normal human being… onto that answer…

Answer #6: A transfer function is a math formula that describes how a code value relates to light energy input or output.


Wow if that isn’t too vague, eh?

See how I said “math formula” and not some perceptual mumbo jumbo? Note how I also didn’t say the godforsaken dumpster fire of a term “gamma”? There’s good reason for this. For now, let’s realize we’ve learned a new term, and we will use it henceforth. Rejoice, and say the words colour component transfer function or again, simply transfer function, with me.

I gave you a loose term, that isn’t entirely clear, that describes output and input, yet we are talking about displays. I apologize. I did that in the interest of the future, however! “But what about the sixteen times you’ve asked what a code value is and then ask another damn question? What about 0.5 dammit? EXPLAIN SOMETHING, WILL YOU?!!?”

We are getting there, but sadly, we also have to keep these posts brief in this Attention Economy, so go grab a cup of covfefe and relax for a bit until that next question…


¹ If you happen to be on a MacBook Pro from 2016 onwards, an iMac with a Retina 5k display from the latter half of 2015 onwards, an iMac Pro, or any of the many wider gamut Eizos and other displays, your lights are actually a different colour. For now, we will ignore this facet and try to bolt down the foundational concepts.

² Sorry Team USA, the rest of the world is on the metric system. No, really. Here’s a useful map of all of the folks, shown here in red, using that other system…

³ See footnote 2 in the first question.

⁴ If you had the sRGB specification sitting in front of you, you’d open it up to plenty of math and strange terms. Those math and strange terms though, help to describe how the device in question operates. On some level, that includes some absolute metrics and models, one of which we learned above, known as a candela per square meter, or nit. It also describes two formulas for getting to and from sRGB code values. Those functions are typically called transfer functions.
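For the curious, here is a minimal Python sketch of that pair of formulas, using the piecewise constants commonly quoted from the specification. The function names and the `REFERENCE_PEAK_NITS` constant are made up for illustration; the 80 nit figure is the reference display value mentioned above. Treat it as a sketch of the math on paper, not a claim about how any particular display in the wild behaves.

```python
# Reference display peak mentioned in the post; the constant name is illustrative.
REFERENCE_PEAK_NITS = 80.0


def srgb_decode(code_value: float) -> float:
    """Code value (0.0 to 1.0) -> relative light (0.0 to 1.0), piecewise formula."""
    if code_value <= 0.04045:
        return code_value / 12.92
    return ((code_value + 0.055) / 1.055) ** 2.4


def srgb_encode(light: float) -> float:
    """Relative light (0.0 to 1.0) -> code value (0.0 to 1.0), the inverse direction."""
    if light <= 0.0031308:
        return light * 12.92
    return 1.055 * light ** (1.0 / 2.4) - 0.055


for cv in (0.0, 0.5, 1.0):
    relative = srgb_decode(cv)
    nits = relative * REFERENCE_PEAK_NITS
    print(f"code value {cv:.1f} -> {relative:.4f} relative -> {nits:.1f} nits")
```

Under this piecewise formula, a code value of 0.5 decodes to roughly 0.214 of maximum, which is nowhere near 50%.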

“Aha!” you might say, “That’s GAMMA!!!!111!!!” Well I have some bad news for you. The actual standard has a little note in it. The little note reads:

1) The term “gamma” is not used in this standard for the reasons discussed in annex A.

If we flip to annex A, we find this extremely wise bit of text:

Annex A
(informative)

Ambiguity in the definition of the term “gamma”

Historically, both the photographic and television industries claim integral use of the term “gamma” for different effects. Hurter and Driffield first used the term in the 1890s in describing the straight-line portion of the density versus log exposure curves that describe photographic sensitometry. The photographic sensitometry field has used several interrelated terms to describe similar effects, including gamma, slope, gradient, and contrast. Both Languimier in the 1910s and Oliver in the 1940s defined “gamma” for the television industry (and thus the computer graphics industry) as the exponential value in both simple and complex power functions that describe the relationship between gun voltage and intensity (or luminance). In fact, even within the television industry, there are multiple, conflicting definitions of “gamma”. These include differences in describing physical aspects (such as gun “gamma” and phosphor “gamma”). These also include differences in equations for the same physical aspect (there are currently at least three commonly used equations in the computer graphics industry to describe the relationship between gun voltage and intensity, all of which provide significantly different results). After significant insightful feedback from many industries, this standard has explicitly chosen to avoid the use of the term “gamma”. Furthermore, it appears that the usefulness of the term in unambiguous, constructive standard terminology is zero and its continued use is detrimental to consistent cross-reference between standards and unambiguous communication.

That’s right folks… the sRGB standard, way, way, way back in 1999, chose to drop the awful term “gamma”, for good reason. It’s a piece of junk term that is sufficiently overloaded to mean close to nothing when we are trying to discuss digital colour on a technical level such as these postings. Worse, when you utter it as an incantation, all sorts of drooling meatheads come out of the woodwork and start sealioning bullshit at you. You’d be wise to skip the term, and stick to the one outlined in the many specifications that define the very things digital artists use on a daily basis. To the Sealions? F*ck off.

Friends don’t let friends say “gamma.”

10 replies on “Question #6: What the F*ck is a Transfer Function?”

Hello again! I have kept reading some other sources and there are some concepts that are often loosely mentioned, and I can’t completely wrap my head around them. Therefore, I come back here to ask yet another question.
Say we are recording with a Sony camera and S-Gamut3 primaries; if I have understood properly, the camera will apply a S-Log3 function (for example) to the radiometric light values. To work “scene referred”, the moment we input this footage into our desired software we would apply the inverse function to the S-Log3. But if for some reason the VFX was working with bt.709 primaries we would still need to perform a color transform operation to shift the primaries from the S-Gamut3 ones using a 3×3 matrix. I understand the same happens when reproducing for different displays: sets of 3×3 matrices would be used to adapt the image to each of them.
If I am not terribly wrong with what I have just summarized, what happens when we input a footage in our post production/compositing software of choice? I apologize in advance for bringing a particular case but I think the Foundry’s pages are quite the example. As is now obvious, I normally work with Nuke, and in most of their documentation they refer to their working color space as “linear”, which, out of the box, is nothing but an amazing way to make stuff unclear for those trying to learn digital color. Only here (https://support.foundry.com/hc/en-us/articles/115000229784) do they seem to briefly acknowledge the contradiction that is using “linear” as loosely as they do:

However, Nuke’s “colorspace” isn’t a standard colourspace…
…a colourspace is a set of parameters which define the colour capabilities of a particular device or stored in a digital file, generally outlined by a set of three colour primaries, a white point and transfer function(s).

Nuke’s internal color management does not define primaries, white point and transfer function(s), instead they are driven by a transfer function from one “colourspace” to another “colourspace”. This works because of the Grassman’s Law principle of additive colour mixture, more information about which can be found here:

https://www5.in.tum.de/lehre/vorlesungen/graphik/info/csc/COL_11.htm

What this means is that the primaries don’t matter for any of the operations Nuke does, provided they are consistent. Switching between “colorspaces” involves a linear operation with a matrix, which through a linear transfer function preserves the linearity required when modelling the accumulation or attenuation of light.
This also allows users to mix a variety of images from various colourspaces as they would be only having linear based colour transforms applied to them.
Nuke’s working space is locked to linear in order for it to be able to obey Grassman’s Laws.

If I understand what I read, it means that they are indeed performing a 3×3 matrix Color Transform every time some data enters the software: from one “colorspace” (I presume the footage’s) to another “colorspace”??? They are talking about a destination colorspace but at the same time saying that Nuke doesn’t have a standard colorspace because it doesn’t define primaries. So what is the destination colorspace and what is Nuke using that matrix they talk about for???
I am clearly not understanding something here, but I think I wouldn’t be the only one somewhat lost after reading this page, which looks like it is using a “because Grassman said so” as a way of magically explaining everything.

I guess my question really is: how does a Color Transform 3×3 Matrices relate to Transfer functions? Are they 2 completely separate operations, or are in some cases 3×3 matrix considered Transfer Functions?

Thanks for everything Troy!

> the camera will apply a S-Log3 function (for example) to the radiometric light values.

I would caution against the term “radiometric”. Once the sensor has filtered the radiometric values according to a loosely-observer based filtration, the radiometry is gone. It’s a one-way process. At the point when we have the quantal catches, they are simply quantal catches, awaiting fitting. We could argue that the values are already partially-photometric based on the qualities of the filtration and an attempt to loosely match the Standard Observer colourimetry and balance against noise.

> To work “scene referred”, the moment we input this footage into our desired software we would apply the inverse function to the S-Log3.

I’ve since become extremely hesitant to ever say “scene” here, largely because once we are in a tristimulus system, it is 100% photometric tristimulus. We can use tristimulus as a proxy, but it’s a quasi-neurophysical model.

> If I am not terribly wrong with what I have just summarized, what happens when we input a footage in our post production/compositing software of choice?

The problem becomes one of meaning. A 3×3 is an affine transform, that will hold parallel lines as being parallel effectively in the projected resultant domain. However, we will end up with non-information if we intend to perform work on the values. Specifically, in the open domain tristimulus, some values will be less than zero! While this works fine for specifying a tristimulus coordinate, for actual work, it becomes hugely problematic.

Case in point… let’s consider the simple case of BT.709. A triplet might be `0.3, 0.1, 0.3`, and we’d calculate the luminance by taking the luminance of each channel, and summing it. That is, `(R * R_weight) + (G * G_weight) + (B * B_weight)`. Luminance is used here to give us a general idea as to loosely how “bright” the pixel will be. It’s not ideal, but it is a reasonable approximation for our purposes.

Now consider if our origin encoding had tristimulus colourimetry that cannot be expressed in BT.709, even in the open domain of zero to infinity. We might have `0.3, -0.1, 0.3`. In this case, the “green” channel is actually specifying a luminance that is lower in the resulting triplet. Specifically, it indicates *negative* luminance. Let’s run the math:

The origin luminance of `0.3, -0.1, 0.3` is `(0.3 * 0.2126) + (-0.1 * 0.7152) + (0.3 * 0.0722)`, which is `0.01392`.
Now let’s clip that small negative off. `0.3, 0.0, 0.3` is `(0.3 * 0.2126) + (0.0 * 0.7152) + (0.3 * 0.0722)`, which is a luminance value of `0.08544`.

As you can see, that is a *tremendous* problem, and all because of non-information relative to the target destination domain. We increased the luminance of that spatial sample more than sixfold!
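If you want to poke at this yourself, here is a small Python sketch of the same arithmetic. The helper names are just for illustration; the weights are the BT.709 luminance weights used in the math above.

```python
# BT.709 luminance weights for R, G, B, as used in the worked example.
BT709_WEIGHTS = (0.2126, 0.7152, 0.0722)


def luminance(rgb):
    """Weighted sum of the three channels."""
    return sum(channel * weight for channel, weight in zip(rgb, BT709_WEIGHTS))


def clip_negatives(rgb):
    """Naively clamp every negative channel to zero."""
    return tuple(max(channel, 0.0) for channel in rgb)


origin = (0.3, -0.1, 0.3)
clipped = clip_negatives(origin)

before = luminance(origin)
after = luminance(clipped)

print(f"origin  {origin} -> luminance {before:.5f}")
print(f"clipped {clipped} -> luminance {after:.5f}")
print(f"clipping scaled the luminance by {after / before:.2f}x")
```

Try a few other triplets with a small negative channel; the closer the origin luminance sits to zero, the more violent the jump becomes after the clip.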

> they seem to briefly acknowledge the contradiction that is using “linear” as loosely as they do:

Many folks do. “Linear” is as bad as “Gamma” these days. I prefer to ask the question “Linear with respect to what?” or better “Uniform with respect to what?”

In the case of tristimulus? It’s *always* uniform. Open domain tristimulus negative infinity to positive? Still tristimulus! Closed domain zero to one hundred percent? Still tristimulus! It is important to not conflate “Encoded for signal transmission” with the ridiculously overloaded term “linear”. Under this lens, any log-like transfer function takes the information to an encoded state that *is not tristimulus*, and as such, the idea of calling it “nonlinear” is somewhat goofy. It might be wiser to simply consider the signal as an encoded signal relative to the optical tristimulus, as opposed to using the umbrella overloaded terms “linear” and “non-linear”. I wish I had a better solution…

> This works because of the Grassman’s Law principle of additive colour mixture, more information about which can be found here

This is the basis of all tristimulus colourimetry, which is itself subject to debate. There is no debate that the CIE 1924-1931 Standard Observer Illuminant E model is entirely based on this Grassmann additivity law, and as such, affine transformations via matrices “work” by gaining the quasi-neurophysical tristimulus amplitudes.
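A quick way to see what additivity buys us is a minimal Python sketch like the one below. It uses the commonly published BT.709 to CIE XYZ matrix, and a plain `1 / 2.2` power function as a stand-in for “some nonlinear encoding”. That power function and all of the helper names are assumptions for illustration, not any particular standard’s curve.

```python
# Commonly published BT.709 (D65) RGB -> CIE XYZ matrix, rounded to four places.
BT709_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)


def apply_matrix(matrix, vector):
    """Multiply a 3x3 matrix by a 3-component vector."""
    return tuple(sum(row[i] * vector[i] for i in range(3)) for row in matrix)


def add(a, b):
    """Component-wise sum of two triplets, i.e. mixing two lights."""
    return tuple(x + y for x, y in zip(a, b))


def power_encode(vector, exponent=1.0 / 2.2):
    """Stand-in nonlinear encoding (an assumption, not a specific standard)."""
    return tuple(c ** exponent for c in vector)


light_a = (0.2, 0.1, 0.05)
light_b = (0.1, 0.3, 0.2)

# Matrix applied to the mixture vs. mixture of the matrixed lights.
matrix_of_sum = apply_matrix(BT709_TO_XYZ, add(light_a, light_b))
sum_of_matrix = add(apply_matrix(BT709_TO_XYZ, light_a),
                    apply_matrix(BT709_TO_XYZ, light_b))

# The same experiment with the nonlinear encoding in place of the matrix.
encoded_of_sum = power_encode(add(light_a, light_b))
sum_of_encoded = add(power_encode(light_a), power_encode(light_b))

print("matrix :", matrix_of_sum, "vs", sum_of_matrix)
print("encoded:", encoded_of_sum, "vs", sum_of_encoded)
```

The two matrix results agree up to floating point rounding, while the two encoded results do not. That is the whole reason the matrix has to be applied to uniform tristimulus and not to encoded code values.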

> This also allows users to mix a variety of images from various colourspaces as they would be only having linear based colour transforms applied to them.

Careful. I’d be extremely cautious thinking this is the case, but perhaps this is the subject of another discussion.

Colour, as you can probably guess, is an entirely complex neurophysical bit of magic. That is, it doesn’t exist outside us. We use the quasi-neurophysical model in an attempt to approximate a stimulus-like specification as to how these quasi-neurophysical signals would interact.

To suggest that there is an absolute ground truth in colour is *extremely* foolish. Hopefully that recent chapter on “Perceptual Colour”, which is ultimately the only “real” facet of colour, showcases just how mystical and slippery colour is, constructed and fabricated as it is by our embodied cognitive states.

I’d be extremely reluctant to suggest that a hack, based on tristimulus, would yield what is stated in the above quote. It is wise to remember that this approach is ultimately a hack.

> They are talking about a destination colorspace but at the same time saying that Nuke doesn’t have a standard colorspace because it doesn’t define primaries. So what is the destination colorspace and what is Nuke using that matrix they talk about for???

Most pipelines that use Nuke lean hard into software like OpenColorIO or bespoke matrices and such to keep a tristimulus uniform “working space” behind the scenes. This is *not* the ideal working space for all operations, contrary to much conventional wisdom. Reconstructive upsampling being a classic case where uniform tristimulus sampling will yield incorrect results in the “upscaled” buffer. There are many others.

> I guess my question really is: how does a Color Transform 3×3 Matrices relate to Transfer functions? Are they 2 completely separate operations, or are in some cases 3×3 matrix considered Transfer Functions?

The general idea is that an affine 3×3 matrix transform *must* be applied on uniform tristimulus. Given that the signal encoded into most cinema cameras is a log-like transfer function, that must first be “undone” to derive the uniform tristimulus colourimetry. If we stack everything together, we end up with something like:
1. Log-like camera encoding. LogC-v3, SLog2, etc. are all examples, and would all fall under the general term of a Transfer Function. Before we can apply a matrix, we would need to decode the encoding via the inverse of the camera log-like transfer function encoding to get to…
2. Uniform tristimulus in some camera encoding specified coordinate system. Arri Wide Gamut v3, S-Gamut v2, etc. are all examples. We would apply a 3×3 matrix to get from camera uniform RGB to…
3. CIE XYZ based tristimulus. At this point the achromatic centroid is fixed. We’d want to shift the achromatic axis to align it to our working space, which is typically another 3×3 chain. Once complete, we will likely want to get from the CIE XYZ based colourimetry, using a 3×3 matrix, to…
4. Uniform tristimulus in some open domain colourspace representation. Some folks might use BT.709 based primaries. Others might use the camera native colourspace such as Arri Wide Gamut v3 etc. There is much nuance here, and the results are different based on which working space we use.

After 4., Nuke would be used to apply various manipulations. Nuke assumes an open domain representation in the working space, negative infinity to positive infinity, so care must be taken by the authors to avoid the sorts of problems such as the negative non-information signal prior to working on the spatial pixel samples. From 4. onward, the image would be formed, and we would have to go through a similar series of transforms to generate encodings for specific display mediums.

But *that* is another discussion altogether. Hopefully you can see how a full transformation chain will often involve transfer functions *and* matrices, or even more peculiar transforms!
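To make steps 1 through 4 a little more concrete, here is a rough Python sketch using the S-Log3 / S-Gamut3 to BT.709 case from the original question. The S-Log3 constants and the S-Gamut3 matrix are as I recall them from Sony’s published technical summary, and the XYZ to BT.709 matrix is the commonly quoted D65 one, so verify all of them against the vendor documents before trusting the numbers. Both encodings share a D65 achromatic axis, so this sketch skips the adaptation step. All function and constant names are illustrative.

```python
def slog3_decode(code_value):
    """Step 1 -> 2: S-Log3 code value (0.0 to 1.0) to uniform values (0.18 = mid grey)."""
    if code_value >= 171.2102946929 / 1023.0:
        return (10.0 ** ((code_value * 1023.0 - 420.0) / 261.5)) * (0.18 + 0.01) - 0.01
    return (code_value * 1023.0 - 95.0) * 0.01125 / (171.2102946929 - 95.0)


# Step 2 -> 3: camera uniform RGB to CIE XYZ (S-Gamut3, D65).
SGAMUT3_TO_XYZ = (
    (0.7064827132, 0.1288010498, 0.1151721641),
    (0.2709796708, 0.7866064112, -0.0575860820),
    (-0.0096778454, 0.0046000375, 1.0941355587),
)

# Step 3 -> 4: CIE XYZ to an open domain BT.709 working space (D65).
XYZ_TO_BT709 = (
    (3.2404542, -1.5371385, -0.4985314),
    (-0.9692660, 1.8760108, 0.0415560),
    (0.0556434, -0.2040259, 1.0572252),
)


def apply_matrix(matrix, vector):
    """Multiply a 3x3 matrix by a 3-component vector."""
    return tuple(sum(row[i] * vector[i] for i in range(3)) for row in matrix)


def camera_to_working(encoded_rgb):
    """Undo the transfer function first, then apply the matrices."""
    uniform_camera = tuple(slog3_decode(c) for c in encoded_rgb)
    xyz = apply_matrix(SGAMUT3_TO_XYZ, uniform_camera)
    return apply_matrix(XYZ_TO_BT709, xyz)


mid_grey = camera_to_working((420.0 / 1023.0,) * 3)
saturated_green = camera_to_working((0.2, 0.6, 0.2))

print("mid grey        ->", mid_grey)
print("saturated green ->", saturated_green)
```

Mid grey should land at roughly 0.18 in all three channels, while the saturated green ends up with a negative “red” channel in BT.709. That is exactly the negative non-information problem from the luminance example earlier.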

Hope this helps!

It helps very much indeed.

> Most pipelines that use Nuke lean hard into software like OpenColorIO or bespoke matrices and such to keep a tristimulus uniform “working space” behind the scenes. This is *not* the ideal working space for all operations, contrary to much conventional wisdom. Reconstructive upsampling being a classic case where uniform tristimulus sampling will yield incorrect results in the “upscaled” buffer. There are many others.

But even if Nuke is keeping a tristimulus uniform “working space” by default as the image data is read, it only applies the inverse Log it considers appropriate and keeps the primaries of the input untouched right? I looked at the Colorspace node and observed that the default values are “Linear-D65-sRGB” and just by changing the first field was getting similar results to modifying the Colorspace dropdown menu of both Write and Read nodes. Furthermore the names inside that dropdown were of Transfer Functions, so I assume that by default the only thing that is going on in the reading/writing process is just the inverse Log. In this case then, we would need to perform the series of 3×3 matrices you explained to modify appropriately the primaries.

>3. CIE XYZ based tristimulus. At this point the achromatic centroid is fixed.

Wouldn’t the achromatic centroid be also fixed in the Uniform tristimulus in the camera encoding specified system in 0.3127, 0.3290 (D65) (In the case of Sony S-Gamut/S-Gamut3 if I am not mistaken) that we already have in point 2.? At this point I am not entirely understanding this:

> shift the achromatic axis to align our working space, which is typically another 3×3 chain.

Isn’t there a 3×3 that gets you from the Uniform tristimulus native to the camera to the one that we want to work with? And if we were okay with working in the native space of the camera, couldn’t we stop the process after applying the inverse Log? Since we are already working with uniform tristimulus and the primaries we want.
If I am following, I understand that the grading process and the “look” modifications would be performed after 4. Right?

Thanks again for your amazing reply and your patience.

> it only applies the inverse Log it considers appropriate and keeps the primaries of the input untouched right?

Not via OpenColorIO or a bespoke chain. A matrix will be applied as well, in an attempt to “align” the tristimulus.

Sadly that is usually the extent, and that alone is insufficient due to domain mismatches.

> Furthermore the names inside that dropdown were of Transfer Functions, so I assume that by default the only thing that is going on in the reading/writing process is just the inverse Log. In this case then, we would need to perform the series of 3×3 matrices you explained to modify appropriately the primaries.

100% correct. Nuke assumes a BT.709 open domain ground truth. Many other pipelines do not however.

> Wouldn’t the achromatic centroid be also fixed in the Uniform tristimulus in the camera encoding specified system in 0.3127, 0.3290 (D65) (In the case of Sony S-Gamut/S-Gamut3 if I am not mistaken) that we already have in point 2.? At this point I am not entirely understanding this

*If* the achromatic axis is the same from the camera to the working projection, then further matrix distortions are not required.

> And if we were okay with working in the native space of the camera, couldn’t we stop the process after applying the inverse Log? Since we are already working with uniform tristimulus and the primaries we want.
If I am following, I understand that the grading process and the “look” modifications would be performed after 4. Right?

Correct.

But remember that any assets or picked tristimulus values would all need proper handling to intermix with the camera observer tristimulus.

Hello again Troy! I feel like this thread is exceeding the #6 topic but this is where these questions and replies have taken me and I feel it doesn’t make a lot of sense to reply elsewhere.
After these questions I found myself digging in several ACES central discussions that you were part of, and especially in the Output Transforms one.
Some real nice definitions and information were shared in there, like Sean Cooper’s intervention regarding “Tone Mapping”, which I will take as a reference for this post (https://community.acescentral.com/t/output-transform-tone-scale/3498/51). Picking up after point 4 of those you listed in one of your previous replies, and after our grade/LMT, which would be point 5, again, if I understood properly, point 6 would be what seems to be the Tartarus gates: Output Transforms (where I understand Tone Mapping occurs).
Taking Jed’s OT description (https://community.acescentral.com/t/output-transform-tone-scale/3498/60), plus the color appearance step that Mansencal mentions as a reply to that same post and that you start covering in the question #27, it looks like a pretty complete potential OT and point 6.

With this new information I jumped into Nuke again and started testing and playing with a lot of the files, configs, scripts, and equations provided in those discussions and I found myself understanding considerably more “stuff” about color than a few weeks ago. But again, more questions. In this case I was working with your Filmic-blender and upon reading the documentation I saw that your config indeed performs a tone map. Knowing that the Transfer Function of Base Log Encoding is a combination of the 3D lut for gamut compression and the final Base Log Encoding covers the log 2 I guessed that the first one is the file in the “lut” folder: (desat65cube.spi3d). Therefore, I thought that the Tone Map has to be performed in the “look transforms”. Am I wrong? They are all 1d luts but what is the math behind them? Is it related to the famous filmic S shaped curve? In the thread I saw many different equations proposed for the Tone Map step.
Kind of understanding most of the above steps I tried to dive into the popular Arri-K1S1 Display Transform in the .nk file that again Jed provides in here: https://community.acescentral.com/t/per-channel-display-transform-with-wider-rendering-gamut/3768. But a few variations in it turned into yet again more questions. First of all, this workflow applies the Tone Map after an inverse log which means that it is being applied to a non-uniform tristimulus. This made me wonder whether it was one of those operations that you said benefit from working non-linear:

> Most pipelines that use Nuke lean hard into software like OpenColorIO or bespoke matrices and such to keep a tristimulus uniform “working space” behind the scenes. This is not the ideal working space for all operations, contrary to much conventional wisdom.

I see then that a “DeGamma” and a “ReGamma” are performed, which I guess, partly because of their values and partly due to the operation in between, are nothing more than a log to get back to a uniform tristimulus to apply the 3×3 desat matrix and the 2.4 gamma EOTF for bt rec709. But maybe I am guessing too much.

Thank you again for your replies and help and sorry for the extension of the post.

> But again, more questions. In this case I was working with your Filmic-blender and upon reading the documentation I saw that your config indeed performs a tone map.

I have updated the readme at GitHub to reflect the usage of language and concepts I would change if I were rewriting things today. It might be of interest to you.

> Therefore, I thought that the Tone Map has to be performed in the “look transforms”. Am I wrong?

I am not entirely sure what the specific question is here.

I currently *strongly* believe that the general notions that I parroted, in an effort to re-use some of the language out there, are dreadfully wrong.

I see a *clear* division between open domain tristimulus from a render, or a “decompressed” version from a camera, as **not a picture**. This is a longer discussion that I would like to put here at some point, but have yet to.

Specifically, I do not believe open domain tristimulus to be a formed image, and as such, we can derive a simple analogy to creative chemical film; a manipulation “in front of” the camera, pre picture formation in the creative film, and a manipulation of the picture formed, post picture formation.

OpenColorIO sadly has engineered the entire chain around the pre-formation stage, and as such, some creative opportunities are impossible. I’ve demoed this with the AgX proof of principle, which effectively means some creative adjustments can only be achieved by bundling the transform into a “view”, to use OpenColorIO parlance.

> They are all 1d luts but what is the math behind them? Is it related to the famous filmic S shaped curve? In the thread I saw many different equations proposed for the Tone Map step.

With creative film, the curve relates to the rate of change of the densities of the subtractive process. This brings the majority of the picture formation mechanics into view. In additive systems, the picture formation is wildly different in many ways, because a single channel of tristimulus always remains fixed in a chromaticity domain projection, and varies only in intensity.

The *mechanic* of how this works, in conjunction with the channel by channel independent lookup nature, is what makes the picture; the “origin” tristimulus is completely discarded, and a completely new tristimulus coordinate is generated.

> But a few variations in it turned into yet again more questions.

Follow them! We need *more* people following them, not fewer!

> First of all, this workflow applies the Tone Map after an inverse log which means that it is being applied to a non-uniform tristimulus.

The distortions are indeed important to generating the picture, which in my estimation, is perhaps one of the most overlooked concepts in contemporary image authorship; the “origin” colourimetry from a camera *never is displayed*. And this is a **good** thing, because way down deep in this rabbit hole, one will find all of the research and thought that *a picture is not a “representation” of the colorimetry measured from a point in space*. It cannot be, and should not be, and *must* not be. But that is another discussion entirely…

I believe if you follow the Harald post that is out there, you will see that he takes the picture tristimulus, which are *uniform* (linear) with respect to the picture but *non-uniform* (non-linear) with respect to the origin tristimulus, and treats them “as encoded for a display medium”. That is, he “undoes” the display medium encoding to get to the uniform picture tristimulus, and applies a reprojection of the tristimulus and a scaling, which amounts to a slight reduction in purity of the result.

> Thank you again for your replies and help and sorry for the extension of the post.

On the contrary… seeing someone become aware of things and chase concepts based on their own understanding, and make inferences, is the most rewarding thing of all of this.

Keep at it!


It has been very enlightening indeed, though of course, such enlightenment hasn’t come without more questions.

I get that there is something wrong with how the vocabulary of the “RGB observer based colourimetry/tristimulus model” and that of “electromagnetic radiation quantities” get mixed when they are “irreconcilable” domains, which is the source of plenty of misunderstandings. One example you provide, which I find amazing, is “luminance” vs “dynamic range”. I imagine you mean *absolute luminance* (L)? Then again, this would connect with Tone mapping, which, if I have understood properly, is the process in which we map the *absolute luminance* (L) to the *relative luminance* (Y) (relative to the chosen display):

> “…the extent to which it is possible by the photographic process to produce a pictorial representation of an object which will, when viewed, excite in the mind of the observer the same subjective impression as that produced by the image formed on the retina when the object itself is observed…” L.A. Jones
> He then goes on to say “The proper reproduction of brightness and brightness differences…is of preeminent importance…”. Brightness does have a definition in the color science community, to quote Fairchild “…visual sensation according to which an area appears to emit more or less light”. Jones utilises an explicit simplifying assumption, which I believe is largely implicit in most discussions of “Tone Curves” or “Tone Mapping Operators”.

Extracts from Sean Cooper’s post in ACES forum: https://community.acescentral.com/t/output-transform-tone-scale/3498/51

> Specifically, I do not believe open domain tristimulus to be a formed image, and as such, we can derive a simple analogy to creative chemical film; a manipulation “in front of” the camera, pre picture formation in the creative film, and a manipulation of the picture formed, post picture formation.
> OpenColorIO sadly has engineered the entire chain around the pre-formation stage, and as such, some creative opportunities are impossible. I’ve demoed this with the AgX proof of principle, which effectively means some creative adjustments can only be achieved by bundling the transform into a “view”, to use OpenColorIO parlance.

Does the “pre picture formation” stage relate to the “electromagnetic radiation quantities” and the “post picture formation” relate to the “RGB observer based colourimetry/tristimulus model”?

> I’ve demoed this with the AgX proof of principle, which effectively means some creative adjustments can only be achieved by bundling the transform into a “view”, to use OpenColorIO parlance.

By “creative adjustments”, do you mean the Tone Map, for example? Which operations should happen in the “pre picture formation” stage and which ones in the “post picture formation” stage?

> Therefore, I thought that the Tone Map has to be performed in the “look transforms”. Am I wrong?
> I am not entirely sure what the specific question is here.

What I meant by this was trying to understand where the Tone Map fits in the Filmic workflow. I will come back to this in a bit.

> I believe if you follow the Harald post that is out there, you will see that he takes the picture tristimulus, which are *uniform* (linear) with respect to the picture but *non-uniform* (non-linear) with respect to the origin tristimulus, and treats them “as encoded for a display medium”. That is, he “undoes” the display medium encoding to get to the uniform picture tristimulus, and applies a reprojection of the tristimulus and a scaling, which amounts to a slight reduction in purity of the result.

> “The conversion steps are
> Log C -> Tone Map -> Display Linearization -> RGB Matrix conversion -> Display Encoding
> Hence, when your display is a standard video monitor (ITU Rec 1886)
> Log C -> Tone Map -> x^2.4 -> RGB Matrix conversion -> x^(1/2.4)”

Based on the post Harald is answering, the input picture is non-uniform and we get something like this:
linear source image -> Alexa logC curve -> Log C (undoing) -> Tone Map -> x^2.4 -> RGB Matrix conversion -> x^(1/2.4)
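To make the order of operations concrete, here is a rough sketch of the quoted chain from Tone Map onward, with loud assumptions: `tone_map` is a hypothetical smoothstep placeholder, `MATRIX` is an invented matrix whose rows sum to 1, and neither comes from ARRI, Harald's post, or any real config. Only the x^2.4 and x^(1/2.4) steps are taken from the quote. The input is assumed to be a log-encoded triplet already normalized to 0..1.

```python
def tone_map(v):
    """Hypothetical S-shaped placeholder (smoothstep) on a 0..1 log-encoded value."""
    v = min(max(v, 0.0), 1.0)
    return v * v * (3.0 - 2.0 * v)


def display_linearize(v, exponent=2.4):
    """Undo the display encoding, x^2.4 as in the quoted chain."""
    return v ** exponent


def display_encode(v, exponent=2.4):
    """Re-apply the display encoding, x^(1/2.4) as in the quoted chain."""
    return max(v, 0.0) ** (1.0 / exponent)


# Invented matrix, rows summing to 1 so neutral values stay neutral.
# It mixes a little of each channel into the others, reducing purity slightly.
MATRIX = (
    (0.90, 0.05, 0.05),
    (0.05, 0.90, 0.05),
    (0.05, 0.05, 0.90),
)


def apply_matrix(m, rgb):
    """Multiply a 3x3 matrix by an RGB triplet."""
    return tuple(sum(m[r][c] * rgb[c] for c in range(3)) for r in range(3))


def render(log_rgb):
    """Tone Map -> x^2.4 -> RGB matrix -> x^(1/2.4), channel by channel."""
    toned = tuple(tone_map(c) for c in log_rgb)
    linear = tuple(display_linearize(c) for c in toned)
    mixed = apply_matrix(MATRIX, linear)
    return tuple(display_encode(c) for c in mixed)


print(render((0.5, 0.5, 0.5)))  # a neutral input
print(render((1.0, 0.0, 0.0)))  # a maximally pure input
```

In this sketch the matrix step happens on display-linear values, between the two power functions. A neutral input comes out neutral, while the pure red input picks up a little green and blue, which is the kind of purity reduction described above.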

So the Tone Map is applied to uniform tristimulus. This is the step I wish to understand better and compare to the Filmic config. I understand that this step in Filmic is achieved by “bundling the transform into a “view”” (that lives inside the “looks” folder). The operations contained in those “views” are applied to data that is uniform tristimulus, and then perform the same steps as in Harald’s scheme, right?
Would these “views” be considered “Output Transforms” (as discussed here: https://community.acescentral.com/t/output-transform-tone-scale/3498)?
Up until here I think I comprehend most of it, which makes me quite happy.

Regarding the extra gamut mapping explanation you have on GitHub, I am not sure what you mean by “chromatic attenuation” and “crosstalk”, and I haven’t been successful in finding information on the topic, except for some engineering definitions that I couldn’t make much use of. Where could I get a bit of literature on this topic?

Thanks Troy!


I realize now that this reply got reposted. I had an issue with the account and couldn’t check if it had gone through. Sorry for the duplicate!


Fixed! I am in the process of answering most of the questions as a post. I hope you don’t mind. I’d prefer to credit your full name, but I also understand if you prefer anonymity.
