Hi Tim,
Keep in mind please : I can only judge through the filtering needed for the NOS1. So, this is about what we tend to call "upsampling". Upsampling by itself is a dumb and useless thing (it never sounds better), unless it is used upon a Non-Oversampling DAC, because raw 16/44.1 really can't go. Ok, half a world loves NOS because of its more pure and musical sound, but from the technical point of view it measures very poorly. Proper means of upsampling make NOS the best, and the key is in that "proper". Notice this is not done anywhere for this explicit reason (the DACs don't even exist), so it will be hard to find anything about it (unless it came from my hand).
Now, the first thing a player will fail upon (mind you, regarding this upsampling aspect) is that proper filtering. Also keep in mind that no upsampling means from players are made for this filtering, unless it is Arc Prediction as made for the NOS1, or HQPlayer from Miska, who has kind of the same ideas as I do, with the difference of him letting it loose on Oversampling DACs in order to overrule the filtering of those DACs (an OS DAC always contains the necessary filtering itself). That this is justified for cases one has to wait and see (it completely depends on the math within the DAC whether this works or not) is proven by Arc Prediction itself, which is used by nearly everyone, even after my advice not to use it. So, it once arose within XXHighEnd for the NOS1, but people started using it anyway, and stuck with it.
This part concluded : When upsampling is explicitly set up as a filtering means, it really can work out to be better than doing nothing and leaving it to your OS DAC. But the explicitly filtering players are rare.
Because it is a gadget or something, and merely because people are asking for it, players start to contain upsampling. Look at Bitperfect for example, where the developer really didn't intend to do it, while he is already working on it now because people ask. What will come from that ? Well, something which upsamples.
A very important notice is the fact that nobody is even able to judge the net outcome of the filter(ing). Why ? Because the OS DAC filters again (by its own means, which actually are unknown). But I can, because the NOS1 can. So, software filters can be measured for their result, but while the analogue output of the DAC can be measured for the net result as well, it becomes moot to create a filter while not knowing what any random DAC will do with it. The result might be better, it might be worse, it may be strange. Not so with the NOS1 (and it really is the only one allowing this) because it really doesn't do a thing and it just passes on the filtering done in software. This doesn't mean the filter can be created in theory only, to next look at its result from software only, because the DAC is still there to pass it all on for 100% or 50% or whatever, and since in the end it is an analogue device, it will change again to some degree.
Now it becomes "dangerous", because while a normal filter will degrade steep transients to saltless sines (this is just the general nature of a filter), a filter which sustains the transients may let the DAC choke on that because it can't follow. Now the real merits (of a DAC) come along, because any random DAC doesn't even *need* to follow because it's fed with those saltless sines anyway.
Things now become more and more difficult, because a filter which sustains the steep transients calls for a DAC that is able to follow. When it can't, sound will only get worse. Of course this is now talking the other way around, because first there is this fine DAC, and next a filter can be used that feeds it with fine data. But what I want to say and make clear is : this isn't just a matter of Audirvana being good or best or anything. It is about a best match while nothing (including Audirvana) is made for the job.
A convenient thing at choosing a proper filter is that the NOS1 can take "infinitely all" so to speak. I mean, it has a slew rate on the output stage of 650V/us (microsecond) (with only 2.25VRMS output) so it really can follow everything. That it will throw this "everything followed" at your amps is the next problem to carefully think about. Anyway, the NOS1 can be taken out of the equation when looking at the behaviour of filters, and on this matter a software simulation would just do. Not that I do this, but it is a convenient thing at developing filters, knowing that they work out as simulated in software.
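To put the 650V/us in perspective, here is a rough back-of-the-envelope sketch. The steepest slope of a sine with peak voltage V at frequency f is 2*pi*f*V. Only the 650V/us and 2.25VRMS figures come from the text above; the 20kHz test frequency and the function name are assumptions for illustration.

```python
import math

def max_sine_slew_v_per_us(v_rms: float, freq_hz: float) -> float:
    """Steepest slope of a sine wave, in volts per microsecond."""
    v_peak = v_rms * math.sqrt(2.0)                    # RMS -> peak for a sine
    return 2.0 * math.pi * freq_hz * v_peak / 1e6      # V/s -> V/us

# Assumed worst-case audio-band sine: full output level at 20 kHz.
needed = max_sine_slew_v_per_us(2.25, 20_000.0)
available = 650.0                                      # figure quoted in the text

print(f"needed: {needed:.3f} V/us, headroom: {available / needed:.0f}x")
```

So a full-level 20kHz sine asks for roughly 0.4V/us. A sine is the gentlest case; an unfiltered step asks for more, and that is what the headroom is for.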
This part summarized (and not to forget) : The NOS1 is NOS/Filterless so the first prerequisite is fulfilled, and it is so overly fast that nothing is changed along the way. BUT, anything it is fed with which is "strange", is now followed just the same ...
Hopefully you start to see why the long story is necessary :
1. I know what test data I put in and what it should look like;
2. I can see the result of it and compare it with what it should look like (100% the same in the case of the NOS1);
3. I thus now can put that test data through any player and compare.
None of the players I judged so far come even close to what I want, but now it becomes more complex, because there is a tradeoff at play;
This is the filtering which *is* applied to all existing upsampling, just because of the way it is done. This by itself is a good thing, because it removes mirrors of the signal (the sound) beyond the audio band. Well, theoretically that is a good thing, with the notice that we officially can't hear it. Can't perceive it ? maybe, and this is a subject within itself. Anyway it doesn't belong there, because it is a false image of what plays *in* the audio band (say, 1Hz-20KHz). Arc Prediction doesn't remove that, but it sustains the transients for 100%.
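Where those mirrors sit is plain arithmetic: a tone at frequency f sampled at rate fs has images at k*fs - f and k*fs + f. A minimal sketch, assuming a 1kHz tone as the example and listing the images up to the Nyquist frequency of the 8x rate (the function name is hypothetical):

```python
def image_frequencies(tone_hz, fs_hz, upsample_factor):
    """Images of a tone, up to the Nyquist frequency of the upsampled rate."""
    limit = fs_hz * upsample_factor / 2
    images = []
    k = 1
    while k * fs_hz - tone_hz <= limit:
        images.append(k * fs_hz - tone_hz)             # lower image around k*fs
        if k * fs_hz + tone_hz <= limit:
            images.append(k * fs_hz + tone_hz)         # upper image around k*fs
        k += 1
    return images

# Assumed example: 1 kHz tone, Redbook rate, 8x upsampling.
images = image_frequencies(1_000, 44_100, 8)
print(images)
```

All of these lie above 20kHz, which is why a conventional upsampling filter can remove them without touching the audio band itself.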
That "100%" within itself is not explained in a small paragraph, because "100%" a. isn't in order because of upsampling and b. shouldn't even be in order because of the too low sample rate (remember, we're talking Redbook). So, things *are* flattened, but only because it is a necessary thing (removing the distortion coming from the digital stepping at the too low sample rate).
But look at this like a one sample long (short) transient from zero to full scale, which after 8x upsampling has now become 8 samples long. Still, however, in the time domain nothing changed, because where first the one sample took an amount of time (about 1/44100 s), now those 8 samples take the same amount of time. So, still 100% good after all, but not 100% the same as originally there.
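The timing argument in numbers, as a small sketch. Linear interpolation stands in here purely for illustration; it is not Arc Prediction, whose algorithm is not described in this post, and the function name is hypothetical.

```python
FS = 44_100      # Redbook sample rate
FACTOR = 8       # upsampling factor

def upsample_linear(samples, factor):
    """Place factor-1 linearly interpolated points between neighbouring samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for i in range(factor):
            out.append(a + (b - a) * i / factor)
    out.append(samples[-1])
    return out

step = [0.0, 1.0]                          # zero to full scale in one sample
ramp = upsample_linear(step, FACTOR)       # same transition, now in 8 small steps

original_time = 1 / FS                     # one sample period at 44.1 kHz
upsampled_time = FACTOR / (FS * FACTOR)    # eight sample periods at 352.8 kHz

print(ramp)
print(original_time * 1e6, "us versus", upsampled_time * 1e6, "us")
```

The transition now spans 8 samples instead of 1, but both take about 22.7 microseconds.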
The filter to choose will be the filter which resembles more or less that transient response we like so much. Ok, not that you will know it (although it could be from max 18 bits NOS DACs), but this really does the job were it about good sound. Ok, it is.
Finally coming to the matter of your question : iTunes can't do better than making a pure (pure !) sine of 1 sample short pulses with 8 samples "no sound" in between. So, that transient going from zero to full scale (say 2V), repeatedly played with 8 samples of nothing in between, turns out to be a sine going from -2V to +2V. A ratatatatat becomes a nice flute. This is what the filter made of it.
Further sticking to Audirvana only, this makes a distorted sound of it. Something like rata-fluuuut-ratata-fluuut-fluuut-ratatatat. Bad ?, well only to some degree. It sustains some transients, but at a lower frequency (the ratatat is only there once in the x samples). It now completely depends on the profound frequency in the music whether that's really audible as something we call "distortion". And as I said earlier, I can hear it here or there. But nothing I would prefer a Windows machine for, were I on a Mac and had to learn the Windows abouts (IOW, XXHighEnd still is better, and infinitely better for the theories on the transients). What it *is* about though, is the huge difference which can be achieved by being able to use this DAC in the first place. Because remember (and believe me) any other DAC will fluuuuuuuuuuuuuuuuuuuuuuuuuuut only. And hey, use iTunes and you won't know the difference anyway.
Dizzy ? Probably. Anyway, the lesson I learned myself is that once there's some ratatat left in the result, it really starts to sound like how the music is to be. This "some" really is "some" and not all that much more. But when there's nothing (like iTunes) it's totally sh*t. Here too I must add some precaution;
The fact that iTunes shows exactly nothing of any transient I put in, does not mean it does nothing all the way. I mean, suppose I wouldn't have 8 samples of silence but 24, something might still be at work, depending on the "length" of the filter. On this matter, think of filters as averaging things, and that they -briefly said- average the samples over a longer period of time in order to come to the net result of the one sample output. But it also works the other way around : if my 8 samples of silence were decreased to 1, all will become a further mess by using a filter which really can't cope, and a song within a song might emerge. Notice that at one sample space the frequency of it is 22050Hz (you will understand that), so my 8 samples space is really something in the audible area. Also, if you have followed my several stories about how improvement on the NOS1 always is about On-Off sound and how foremost synth music is needed to judge that, you will now understand why. This transient stuff is about On/Off all the way, and iTunes makes flutes of everything. But so do OS DACs, or IOW iTunes can't be *that* bad.
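Two of the points above in a short sketch: the repetition frequency of the pulse test for a given gap of silence, and a filter seen as an averaging thing. The plain moving average and its length of 9 are assumptions chosen for simplicity (real upsampling filters use weighted taps), and all names are hypothetical.

```python
FS = 44_100  # Redbook sample rate

def repetition_hz(gap_samples, fs=FS):
    """Frequency of a 1-sample pulse repeated after gap_samples of silence."""
    return fs / (1 + gap_samples)

def pulse_train(gap_samples, pulses):
    """Full-scale 1-sample pulses, each followed by gap_samples of silence."""
    return ([1.0] + [0.0] * gap_samples) * pulses

def moving_average(samples, length):
    """Average each sample with its length-1 predecessors (zeros before the start)."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - length + 1): n + 1]
        out.append(sum(window) / length)
    return out

print(repetition_hz(1))   # 22050.0
print(repetition_hz(8))   # 4900.0

# A 9-sample average over pulses spaced 9 samples apart: every window holds
# exactly one pulse, so the On/Off disappears completely.
flat = moving_average(pulse_train(8, 4), 9)

# Same filter with 24 samples of silence: the windows between pulses are empty,
# so some On/Off survives.
bumpy = moving_average(pulse_train(24, 4), 9)

print(max(flat) - min(flat), max(bumpy) - min(bumpy))
```

Whether anything of the transient is left depends on the filter length relative to the spacing of the pulses, which is the reason a different amount of silence can give a different picture for the same player.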
I hope it is clear that this can't be an absolute judgement about player software. The allowed (choices of) filtering is what matters here, and it really is not a normal application.
Also, I am not finished with judging it all, plus I have not much time for it. But at least I am 100% confident that all is good to go for the NOS1, and I seriously could listen to Audirvana forever if my Windows machines broke down. I hope this tells enough.
Peter