Polyphony: Snow vs TI? [Archive] - The Unofficial Access Virus & Virus TI Forum

MBTC

29.05.2014, 04:59 AM

Polyphony/efficiency of VA synths is always an interesting topic to me. As Timo pointed out, techniques like hypersaw on the Virus, supersaw on the JP8000, or "detune density" (seems oddly named among the three) on the Ultranova can be thought of as techniques / cheats to approach the same type of sound with less processor consumption (allowing more voices to be played at once), whereas I think unison is more straightforward in describing what it does (on synths I'm aware of, at least).

I remember researching this deeply once upon a time, and I believe consensus was that the biggest difference in something like a supersaw of 7 oscillators versus 7 real oscillators that were slightly detuned would be that the CPU/DSP is effectively removing some of the harmonics that would otherwise be emanating from individual oscillators (harmonics that the naked ear might not hear anyway), reducing CPU load and thus achieving a fat sound with less processing required, the thinking being that some timbres would effectively cancel each other out or muddy the waters anyway, so there should be an algorithm that can take the signal from a single osc, multiply and shift it somewhat, then remove that which would not really be heard (by most, under most conditions at least) anyway.

That said, I have no insight as to how the algorithms are actually implemented and I'm sure it varies between synths. However many synths have recognized or emulated the non-linear nature of the detune curve of the JP8000, which results in some interesting characteristics of the final output signal, so many of them strive to implement the same kind of signal (at times paying some level of respect to the JP8000 in how the signal reacts to modulation or knob tweaking, or perhaps putting their own spin on things).

At the end of it all, the nova (the only VA HW synth I have to compare at the moment) behaves much the same way as the virus. If you use all actual oscillators + unison it puts a high load on the DSP, which will require it to steal notes. If you use alternative methods to fatten each osc, you can get way more simultaneous notes out of it. The end result might not be exactly the same sound, but it's also unlikely that you'd notice the difference in most mixes -- at least you can always use things like compression and EQ to squeeze out timbres you feel are missing.

TweakHead

29.05.2014, 06:33 AM

The main difference, as far as sound design is concerned, is that with a super saw like oscillator, you're generating the voices at the oscillator level, while unison duplicates the entire signal, filters and everything included - so it kind of replicates the whole signal, as you would with a multi-patch with the same single replicated with a maximum of 16 parts (on the Virus, no wonder). It sounds different, and for many classic super saw sounds the oscillator simply blows unison out of the water. For a warm analogue-like kind of pad, for example, unison would be better: no wacky out-of-this-world settings, just a couple of voices slightly detuned would do the job nice and easy.

The Virus has an almost perfect emulation of the JP 8000/8080 oscillator, sounding almost identical to its older counterpart - there's videos online with such comparisons. Maybe it will benefit from a slight EQ (internal one would do fine), the broad stroke high shelf making the highs a bit more "in your face" - at least that used to be the case up until the Virus C.

MBTC

29.05.2014, 03:47 PM

The main difference, as far as sound design is concerned, is that with a super saw like oscillator, you're generating the voices at the oscillator level, while unison duplicates the entire signal

I know you might be speaking about the Virus specifically, but I wanted to add that the actual implementation of unison can vary quite a bit across different synthesizers. DUNE2 for example (DUNE is an acronym for differential unison engine) treats each unison voice separately with different sound parameters for each (or you can treat all voices as one if you want). This is one of the things I love about high-end soft synths: because a typical Core i7 or whatever is so much more powerful than typical hardware DSPs, the plugin can give you the freedom to really go nuts and push limits even if it ends up using up 50% of your CPU. That kind of CPU usage is really limiting, but you can of course freeze it or sample it, which is still better than stolen notes IMO.

I believe in the case of the Ultranova, unison is basically giving you a full oscillator with each voice (a complete representation of the osc signal or waveform), but with detune/detune density it's programmatically making "lesser copies" of the signal which are actually thinner / lower fidelity versions of the signal (which, when stacked together of course give a similar effect). Think along the lines of something like synths that give you quality options (eco/draft mode for the sound) to help manage CPU resources. For example your ear can hear the difference between eco mode and best quality mode on most synths, but in a mix and after run through other effects it would be harder to notice on most sounds in most music types. The Ultranova printed manual actually makes a specific point that most of the factory patches use detune density (hypersaw trick) instead of true unison.

MBTC

29.05.2014, 04:03 PM

@Timo (since you're in the enviable position of owning 2 Viruses):

Can the limitations talked about here be overcome by layering multiple Viruses in some way? By that I mean is it possible to create a patch on one of them, dump it over to the other one and run a slightly modified version on the second synth? For example a supersaw lead might normally have some of the voices playing down one octave lower, so for example the indigo could play some voices at high octaves and the snow fill in the lower octaves? Overall polyphony could be increased (not to mention thickness of sound) because less unison voices per synth could be used.

It's a multi-faceted question of course, the basis of which rests on patch compatibility between the B/C series and the TI, I'm guessing, but at the end of it I'm just wondering if polyphony challenges can be addressed with multiple viruses (not in the sense that they truly talk to each other or sync up in the true spirit of daisy chaining or anything).

TweakHead

29.05.2014, 06:18 PM

yeah, was talking about the Virus alone. Dune is one of the best implementations of Unison to my mind as well, was so with version one - that I know, haven't tried the second yet, but seems like great beast to tame.

Have you got it?

@Timo

expecting you to inject some enthusiasm back in here ;) certainly hope so!

MBTC

29.05.2014, 07:20 PM

yeah, was talking about the Virus alone. Dune is one of the best implementations of Unison to my mind as well, was so with version one - that I know, haven't tried the second yet, but seems like great beast to tame.

Have you got it?

Yes, and it is amazing, perhaps my new favorite instrument. Only an $80 upgrade if you owned Dune1. But this is much more than just a new version, it blows Dune1 away.

Spreader

31.05.2014, 07:11 PM

Thanks for the answers guys. They were more interesting than I had anticipated and made me ponder what constitutes a good unison in the first place.

So let's do some quick math. Let's say the longest note that is going to get played is 1 second long which nets us frequency resolution of 1hz. Our detune amount is 40 cents for that trance 2.0 supersaw.

The first obvious question is, how many saw waves do we want? For maximum tweakability there should be one saw wave for each frequency bin. At 16khz the 40 cents of detune is around 400hz wide, which at 1hz per bin means around 400 bins. So the answer is 400 saw waves.

But this becomes immediately problematic because at lower frequencies such as 100hz the 20 cents is only about 1.2hz. So the tone is not going to be consistent there. A further problem is that in higher frequencies the harmonics will overlap when playing low notes, so they are not going to be consistent either.
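For anyone who wants to check this kind of cents-to-Hz arithmetic, here is a minimal sketch. The function name is made up for illustration, and it assumes the detune is measured upward from the base frequency, which the post does not actually specify.

```python
def detune_width_hz(base_hz: float, cents: float) -> float:
    """Width in Hz of an interval `cents` wide, measured upward from base_hz.

    Assumption for illustration: the whole detune sits above the base pitch.
    """
    # There are 1200 cents per octave, and an octave doubles the frequency.
    return base_hz * (2.0 ** (cents / 1200.0) - 1.0)


if __name__ == "__main__":
    for base_hz, cents in [(16000.0, 40.0), (100.0, 40.0), (100.0, 20.0)]:
        width = detune_width_hz(base_hz, cents)
        print(f"{cents:g} cents at {base_hz:g} Hz is {width:.2f} Hz wide")
```

The same number of cents covers far fewer Hz at low pitches, which is why a fixed count of saws per Hz cannot stay consistent across the keyboard.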

Seems like this is in fact a really tricky, possibly unsolvable problem... Oh well. Just use your ears then.

My ears tell me that Supersaws where the saws overlap sound terrible, metallic and so on. In other words, when using massive amounts of saw waves to get those nice smooth highs, the most important thing in the implementation is very high frequency resolution. For example in the prior scenario the resolution needs to be 20/400 = 0.05 cents. No digital synth can possibly pull that off...

Does anyone know what the resolution of the virus TI is? 1 cent?

Anyway, the resolution problem may be responsible for more terrible metallic sounding supersaws than can be calculated here. Or perhaps it's just the way that some synths tend to detune the saws on top of each other...

Timo

31.05.2014, 08:11 PM

@Timo

expecting you to inject some enthusiasm back in here ;) certainly hope so!

Got lots to add, and will address points made. Been away again with choppy wifi for the last 9 days, really frustrating. Will be back home tomorrow.

Was putting both viruses through their paces and doing tests with the Hypersaw before I went away. It does indeed work differently to normal patch unison and depending on the application one does work better than the other and equally importantly vice versa. They are different tools and both have strengths and weaknesses which the other addresses, so it's nice to have both. However the classic sound engine holds its own well with a few tricks.

More tomorrow. :)

Timo

31.05.2014, 08:41 PM

So let's do some quick math. Let's say the longest note that is going to get played is 1 second long which nets us frequency resolution of 1hz.

Just briefly... might just be me, but I've serious problems understanding your post! :) Guess I'll start from the top, why is the duration of a note relevant?

Spreader

31.05.2014, 09:00 PM

Just briefly... might just be me, but I've serious problems understanding your post! :) Guess I'll start from the top, why is the duration of a note relevant?

Because it tells us the frequency resolution of the signal. Sample length is just another name for frequency resolution. The frequency resolution also means how closely the FFT bins are located on a spectrum analyzer (that's why it depends on the selected FFT size = sample size). Most FFT analyzers don't actually draw points though - just like waveform analyzers don't. But only points exist, no straight lines..

Very easy way to think about it, if you construct any signal out of sine waves, you can not have any waves in it that aren't complete sine waves (by definition). The lowest complete sine wave in a 1 second signal is obviously one second long (1hz) then the next one is half a second (2hz) and so on. So there is a limited frequency resolution that depends on the length of the signal.

I arbitrarily picked 1 second because it's hard to point to a supersaw sound that is played for a longer duration than that.
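The relation described above (window length sets the spacing of the analysis frequencies) can be sketched in a few lines. The function names and the 48 kHz example rate are assumptions for illustration only; the point is just that the spacing equals the sample rate divided by the number of samples, i.e. one over the window length in seconds.

```python
def bin_spacing_hz(sample_rate_hz: float, n_samples: int) -> float:
    """Spacing between DFT bins: sample rate / window size = 1 / window length."""
    return sample_rate_hz / n_samples


def bin_frequencies(sample_rate_hz: float, n_samples: int, count: int) -> list:
    """Centre frequencies of the first `count` DFT bins for the given window."""
    spacing = bin_spacing_hz(sample_rate_hz, n_samples)
    return [k * spacing for k in range(count)]


if __name__ == "__main__":
    # One second at 48 kHz: bins land on whole Hz values.
    print(bin_frequencies(48000.0, 48000, 4))
    # Half a second at 48 kHz: bins are twice as far apart.
    print(bin_frequencies(48000.0, 24000, 4))
```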

TweakHead

01.06.2014, 11:58 PM

You're confusing a lot of stuff. You're confusing everything, really.

Example: if you were to combine sine waves, you could produce nearly every timbre imaginable by simply combining them and changing their relative volumes. Actually, that's the theory behind Additive Synthesis.

With a standard (cd format) sample rate of 44.1 Khz, you have 44100 points within a second. A simple sine wave is made of several of these points, and they're not containers for wave cycles, they're just the audio equivalent of pixels if you will, the distance of them being so small that we humans can't pick the space between them, so it feels like a continuous stream instead of a vibrant fast tremolo of sorts - if you will, which is ultimately what it is.

FFT is a math algorithm to transform time and waveform into frequency, to help analyse a signal. The blocks you get to choose don't actually change anything relative to the audio, just the under-the-hood calculations involved in producing the analysis. You're not supposed to see points on a spectral analyser at all, you could expect to see them with an oscilloscope or an audio editor whose resolution allows you to zoom in to the actual points (and yes, they have existed for more than a decade now).

In case you're wondering, the human range of hearing is said to be somewhere around 20Hz-20kHz, so what's the point of even mentioning a 1Hz sine wave? A second is nothing more than the scale we use to count cycles, that's all. How many times has it cycled within a second? That's the definition of Hertz.

How long you sustain a note bears absolutely no connection with any of the above. If I hold a 30Hz sine wave or a 200Hz one for one second, the only thing that's changing is that there's 30 cycles on the first example and 200 on the second.

If you're using detune, then it's natural - even desirable - that the harmonics of the various sawtooth waves get mashed together. That was intentional btw, since the whole point of doing this is to produce a much richer timbre. If you pick two sawtooth oscillators and detune the second one by a few cents, if there's no phase reset on note press and the oscillators are not hard synced, then you'll have beating, that's the perceivable difference between the cycling of the two waveforms - and in musical terms, it just sounds warmer and fatter. Same thing with a handful of saw waves on top of each other, thus the options you get: you get to detune them and pan them, producing larger than life sounds while you're at it.
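As a rough illustration of the beating described here, this sketch estimates how fast each pair of matching harmonics beats when one saw is detuned against another. The helper names are hypothetical, and it assumes ideal sawtooth waves whose n-th harmonic sits at exactly n times the fundamental.

```python
def detuned_hz(base_hz: float, cents: float) -> float:
    """Frequency reached by shifting base_hz by the given number of cents."""
    return base_hz * 2.0 ** (cents / 1200.0)


def beat_rate_hz(base_hz: float, cents: float, harmonic: int = 1) -> float:
    """Beat rate between the same harmonic of two saws detuned by `cents`.

    Harmonic n of each saw sits at n times its fundamental, so the
    difference (and therefore the beat rate) also scales with n.
    """
    return harmonic * abs(detuned_hz(base_hz, cents) - base_hz)


if __name__ == "__main__":
    for harmonic in (1, 2, 5, 10):
        rate = beat_rate_hz(440.0, 5.0, harmonic)
        print(f"harmonic {harmonic}: {rate:.2f} beats per second")
```

Higher harmonics beat faster than the fundamental, which is part of why detuned saws shimmer in the highs rather than just slowly wobbling.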

I highly doubt you could identify a difference of 1 cent in a blind test. Most people would be delighted to pick semitones, that's even called perfect pitch ears - go figure!

Spreader

02.06.2014, 04:25 AM

You're confusing a lot of stuff. You're confusing everything, really.

Not really...

Example: if you were to combine sine waves, you could produce nearly every timbre imaginable by simply combining them and changing their relative volumes. Actually, that's the theory behind Additive Synthesis.

Exactly, that is the whole point of my analysis. But you are incorrect with the term "nearly". Any sample imaginable can be produced by combining sine waves, and that is exactly what the FFT does. Sine waves are synonymous with frequencies, as long as they are the length of the selected sample.

With a standard (cd format) sample rate of 44.1 Khz, you have 44100 points within a second. A simple sine wave is made of several of these points, and they're not containers for wave cycles, they're just the audio equivalent of pixels if you will, the distance of them being so small that we humans can't pick the space between them, so it feels like a continuous stream instead of a vibrant fast tremolo of sorts - if you will, which is ultimately what it is.

I didn't say the samples were containers for wave cycles - although if you think about it they actually are containers for the sinc function once put through the converters. This (the time domain) actually is a continuous wave in analog domain, not collection of fast points creating some illusion of continuous wave.

Anyway the sample points ARE containers for sine waves when the sample is converted into frequency domain via FFT, or other algo. The spectrum can't be continuous if the sample is not infinitely long, even in the analog domain. This is rather obvious if you read my last post.

FFT is a math algorithm to transform time and waveform into frequency, to help analyse a signal. The blocks you get to choose don't actually change anything relative to the audio, just the under-the-hood calculations involved in producing the analysis. You're not supposed to see points on a spectral analyser at all, you could expect to see them with an oscilloscope or an audio editor whose resolution allows you to zoom in to the actual points (and yes, they have existed for more than a decade now).

I don't think I said that the block size that is chosen affects the audio, since FFT represents the signal exactly, that is absurd. The fact however is that it will affect the results of the FFT and you can't have, for example a 1.5hz frequency on a 1 second signal (that would be FFT size of 48000 with 48khz sample rate). The frequency domain can't be continuous unless the time of the sample is infinite, and this has nothing to do with digital resolution.

In case you're wondering, the human range of hearing is said to be somewhere around 20Hz-20kHz, so what's the point of even mentioning a 1Hz sine wave? A second is nothing more than the scale we use to count cycles, that's all. How many times has it cycled within a second? That's the definition of Hertz.

The point is that the 1hz is the frequency resolution, and from that a case can be made on how many saw waves are optimal. I don't understand where you are going with this...

How long you sustain a note bears absolutely no connection with any of the above. If I hold a 30Hz sine wave or a 200Hz one for one second, the only thing that's changing is that there's 30 cycles on the first example and 200 on the second.

Did I somewhere imply the above is not the case? You are wrong that this has nothing to do with what I said. It has everything to do with synthesizing waves which require long time (high frequency resolution) such as supersaws. By the way, try holding a 30.5hz sine wave for one second - you will win a nobel if you pull that off.

If you're using detune, then it's natural - even desirable - that the harmonics of the various sawtooth waves get mashed together. That was intentional btw, since the whole point of doing this is to produce a much richer timbre. If you pick two sawtooth oscillators and detune the second one by a few cents, if there's no phase reset on note press and the oscillators are not hard synced, then you'll have beating, that's the perceivable difference between the cycling of the two waveforms - and in musical terms, it just sounds warmer and fatter. Same thing with a handful of saw waves on top of each other, thus the options you get: you get to detune them and pan them, producing larger than life sounds while you're at it.

I think you completely missed my point. By mashing harmonics together I meant that they are located on top of each other, meaning that when the phase is random they are going to vary in amplitude and sometimes disappear. Just try playing two hypersaws or whatever in unison with exactly the same detune parameters - it will sound subjectively bad - this is what I am talking about.

I highly doubt you could identify a difference of 1 cent in a blind test. Most people would be delighted to pick semitones, that's even called perfect pitch ears - go figure!

You are correct, but even a monkey could tell the difference when there are other waves involved, try detuning an osc against another by 1 cent. We are not talking about an isolated wave here.

TweakHead

02.06.2014, 07:42 AM

https://meocloud.pt/link/97c60ca7-aeac-42d6-a9b1-6f44841ad10c/sinetest2.wav/

TweakHead

02.06.2014, 08:02 AM

TweakHead

02.06.2014, 08:27 AM

I may have come across as being excessively rude with you. But there's no nice way of saying this: you've overloaded your mind with theory that you don't fully understand and you've managed to lose perspective while at it.

Sample rate is to audio what pixels are to image. Think of it this way: there's a threshold to human perception that kind of defines how far we need to go in terms of resolution. That's how we arrived at the standards. That's why some recent Apple equipment is claiming to have hit that threshold with their "retina" monitors - they're claiming that the human eye can't possibly perceive a point with such resolutions.

With audio it's not much different. A point is just a point. It only carries volume information and phase position (time). That's it. Even if you were to create a sine wave with the same exact frequency of your sample rate, that cycle would fall between the points and there's nothing there to hold the information. So if you were to double the sample rate and keep that frequency, then you'd have something akin to an extreme bit crushed sine wave, which would look more like a square wave due to technical limitations.

But one of them is band limits. You can't even produce such wave on the digital realm (and you'd have a hard time producing it anywhere else...). And if you could, it wouldn't even matter, at least for humans, due to the limits of our perception as I explained above.

But I still fail to grasp why you'd think of all these things when confronted with a fairly simple sound design endeavour such as a super saw kind of oscillator. I think they came to be with the Roland JP series (not sure, but think that's correct) and it was precisely the detuned mess (if you will) that gave this sound its famous thickness and the power to cut through any mix you throw it at. Right? So if you were to not use detune, the saws would just be tuned together and hard synced, so only volume would rise. The moment you start detuning them, you'll get a much richer tone, filled with beating - again due to different phases, due to different wave cycle lengths - and some phase cancellation as well.

Again, I fail to see where the problem is with any of this.

Spreader

02.06.2014, 01:37 PM

That's 30 entire cycles and a half. What else would you expect? Besides an audible audio click 'cause the audio cuts on a non zero cross point, right?

So I guess I'm entitled for the Nobel Prize now.

You are indeed confusing a lot of stuff m8. 30.5Hz means exactly that, just that and nothing but that.

This post is completely wrong. A 30.5hz sine wave in a 1 second window is an impossibility because it can't be a complete sine wave, it is cut off. This means it is composed of many different sine waves. Read about spectral leakage.

Further, ALL sine waves are going to have a click at the start, because it is actually impossible for us to hear a real sine wave, since our hearing does not function like a FFT.

Spreader

02.06.2014, 02:01 PM

I may have come across as being excessively rude with you. But there's no nice way of saying this: you've overloaded your mind with theory that you don't fully understand and you've managed to lose perspective while at it.

Hey, no problem, I don't have any problems with criticism. That's what figuring out the best configuration is all about. Also, I would like to point out that nowhere did I say that my approach to this is the only usable one.

Sample rate is to audio what pixels are to image. Think of it this way: there's a threshold to human perception that kind of defines how far we need to go in terms of resolution. That's how we arrived at the standards. That's why some recent Apple equipment is claiming to have hit that threshold with their "retina" monitors - they're claiming that the human eye can't possibly perceive a point with such resolutions.

No, the sample rate defines the audio bandwidth and how much aliasing there is going to be, I am not sure if that's what you are trying to say. However, it sure sounds like you actually think that those points are what is coming out of the converters. Picture reproduction works very differently than audio. Our eyes work completely differently than our ears.

With audio it's not much different. A point is just a point. It only carries volume information and phase position (time). That's it. Even if you were to create a sine wave with the same exact frequency of your sample rate, that cycle would fall between the points and there's nothing there to hold the information. So if you were to double the sample rate and keep that frequency, then you'd have something akin to an extreme bit crushed sine wave, which would look more like a square wave due to technical limitations.

I don't understand what this has to do with anything I said. Doubling the sample rate, if done properly would not create a square wave, but estimate the position of the actual analog waveform by low pass filtering and computing the results. When those points enter the analog domain they are convolved with a sinc function which will produce a continuous bandwidth limited analog waveform.
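The sinc reconstruction mentioned here can be sketched with the textbook Whittaker-Shannon interpolation formula. This is only an illustration under stated assumptions: an ideal converter, a made-up 8 Hz sample rate and 1 Hz tone to keep the numbers small, and a finite run of samples (so the result is close to, not exactly, the original sine). The function names are hypothetical.

```python
import math


def sinc(x: float) -> float:
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, first_index, sample_rate_hz, t_seconds):
    """Value of the band-limited waveform at time t: a sum of sinc pulses,
    one centred on each sample and scaled by that sample's value."""
    position = t_seconds * sample_rate_hz  # time measured in sample periods
    return sum(
        value * sinc(position - (first_index + i))
        for i, value in enumerate(samples)
    )


if __name__ == "__main__":
    sample_rate_hz, tone_hz, first_index = 8.0, 1.0, -200
    samples = [
        math.sin(2.0 * math.pi * tone_hz * n / sample_rate_hz)
        for n in range(first_index, 201)
    ]
    t = 0.3  # a moment that falls between two sample points
    print("reconstructed:", reconstruct(samples, first_index, sample_rate_hz, t))
    print("original sine:", math.sin(2.0 * math.pi * tone_hz * t))
```

The value between the sample points comes out as a smooth sine, not a staircase, which is the point being made about what leaves the converters.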

But one of them is band limits. You can't even produce such wave on the digital realm (and you'd have a hard time producing it anywhere else...). And if you could, it wouldn't even matter, at least for humans, due to the limits of our perception as I explained above.

I have no idea what you are talking about here. There is nothing impossible about creating 30.5hz sine waves in digital domain, or wave of any frequency that is not above the sample rate (although that is possible as well actually), or very tightly detuned supersaws. However, it's impossible for a 30.5hz sine wave to exist in a one second time frame.

But I still fail to grasp why you'd think of all these things when confronted with a fairly simple sound design endeavour such as a super saw kind of oscillator. I think they came to be with the Roland JP series (not sure, but think that's correct) and it was precisely the detuned mess (if you will) that gave this sound its famous thickness and the power to cut through any mix you throw it at. Right? So if you were to not use detune, the saws would just be tuned together and hard synced, so only volume would rise. The moment you start detuning them, you'll get a much richer tone, filled with beating - again due to different phases, due to different wave cycle lengths - and some phase cancellation as well.

There is no complete phase cancellation if the saw waves are tuned to different frequencies. Further, you are assuming an oscillator that magically hard syncs when many waves are on top of each other. Try it without the hard syncing and you will definitely hear the extremely weak sound.

You are correct, however, about the JP8000's supersaw sounding excellent.

TweakHead

02.06.2014, 07:01 PM

https://www.youtube.com/watch?v=bI2URDbI9I0

check this out! ;)

I think Access has done a wonderful job at recreating this oscillator, but you be the judge now. As far as I can tell, you don't even need hardware these days to recreate similar sounds, like Diva for instance.

You misunderstood big chunks of my earlier post, but it doesn't really matter much. Amidst all this, I still don't know what you're looking for or having trouble doing or if you're just a technical oriented person obsessed with technology's limitations (?)... So, what is it you're trying to do?

Spreader

02.06.2014, 08:10 PM

https://www.youtube.com/watch?v=bI2URDbI9I0

check this out! ;)

I have seen that clip before. I have made some comparisons myself and the JP8000 sounds a bit better IMHO than the virus for that "late 90s big detuned trance lead" sort of stuff. I am not sure exactly why that is, might be the high pass filter at the output, or the way JP uses prime numbers, while virus has symmetric detune, or the noise osc or the aliasing...

I think Access has done a wonderful job at recreating this oscillator, but you be the judge now. As far as I can tell, you don't even need hardware these days to recreate similar sounds, like Diva for instance.

Yeah, the diva recreation seems spot on. The hypersaw is a different beast from the supersaw though. It's not really a supersaw copy.

You misunderstood big chunks of my earlier post, but it doesn't really matter much. Amidst all this, I still don't know what you're looking for or having trouble doing or if you're just a technical oriented person obsessed with technology's limitations (?)... So, what is it you're trying to do?

The point of my original post was to address the first question that is usually asked when creating a supersaw. How many saw waves are wanted? First thing is of course to figure out at what point adding more saws does nothing new... That's kind of what the post was about.

TweakHead

02.06.2014, 08:19 PM

Fair enough. It won't get much further than the answer you provided yourself:

"trust your ears then"

Do you prefer the sound of the unison or hypersaw, btw?

Don't think there's any right and wrong when approaching this. It's right when it feels/sounds right, so tweak away and don't be such a tweak head :twisted:

I think you can choose 48k sample rate on the Virus TI, no? This should, presumably, push the aliasing a bit further to the right of the spectrum, making it less noticeable. Interestingly enough, the Virus has much larger definition than the good old JP does. A slight boost on the highs can help too.

TweakHead

03.06.2014, 12:50 PM

I was going to take it easy, but no.

For the sake of truth: that's a pure 30.5Hz wave, it's got more than one complete cycle, in fact there's 30 and a half. The only thing you can complain about is "discontinuity" - so go ahead and search for that.

Furthermore: we can hear sine waves and sine waves don't actually produce any sort of click in their beginning, not unless reproduction of the audio starts or ends on a non zero crossing point. You may have that "feeling" with instruments whose oscillators are incapable of fixing phase start position. You can set phase initial position on the Virus, btw.

The only thing wrong with my sample is that you can't loop it and still get a perfect 30.5Hz sine wave, you'd have to go back, choose a zero cross point at the end of a cycle, and then if you were to loop that, you'd have a perfect 30.5Hz sine wave playing for as long as it makes you happy. Sampling frequency (the time you set to analyse a signal) has nothing to do with the frequency of the signal itself, so as a matter of fact you are INDEED confusing a lot of things. Mentioning spectral leakage is almost laughable at this point, since you show absolutely no accuracy in your remarks and you're failing to provide a solid basis for any of your claims. In fact, it all reads like pretentious rubbish talk to me and any informed reader. So there's a little honesty for you to digest slowly with a pinch of salt m8.

Spreader

03.06.2014, 03:35 PM

I was going to take it easy, but no.

For the sake of truth: that's a pure 30.5Hz wave, it's got more than one complete cycle, in fact there's 30 and a half. The only thing you can complain about is "discontinuity" - so go ahead and search for that.

And how do you fit more than one complete cycle in the one second window? You can't. It's not a sine wave. Easier way to think about this is trying to fit a 0.5hz sine into the window. Now it's obvious that you have a rectified sine wave on your hands, which has entirely different spectrum than a sine wave.

Furthermore: we can hear sine waves and sine waves don't actually produce any sort of click in their beginning, not unless reproduction of the audio starts or ends on a non zero crossing point. You may have that "feeling" with instruments whose oscillators are incapable of fixing phase start position. You can set phase initial position on the Virus, btw.

This is incorrect. We cannot hear a real sine wave (just try it by setting the osc to 0 phase- you will hear the click and can even EQ it). A real sine wave would have to be infinite. When you play a shorter than that sine wave it's combined with the gate(silence) at the start. This kind of function will have to have a different spectrum than a pure sine wave and it will contain (almost) all frequencies - some of which are cut off by the converter's reconstruction filter altering the sine wave. A perfectly round finite sine wave can't even be reproduced...

The only thing wrong with my sample is that you can't loop it and still get a perfect 30.5Hz sine wave, you'd have to go back, choose a zero cross point at the end of a cycle, and then if you were to loop that, you'd have a perfect 30.5Hz sine wave playing for as long as it makes you happy. Sampling frequency (the time you set to analyse a signal) has nothing to do with the frequency of the signal itself, so as a matter of fact you are INDEED confusing a lot of things. Mentioning spectral leakage is almost laughable at this point, since you show absolutely no accuracy in your remarks and you're failing to provide a solid basis for any of your claims. In fact, it all reads like pretentious rubbish talk to me and any informed reader. So there's a little honesty for you to digest slowly with a pinch of salt m8.

If you loop a 30.5hz sine that is cut off at 1 second mark, you would hear exactly what I have stated - the spectrum would contain more than one frequency and the frequency of the wave would be some multiple of the 1hz. 30.5hz would be obviously impossible inside a 1 second window. Although our hearing works very differently from a spectrum analyzer so using a much smaller window works much better - try 0.01sec. The point that you can slice a wave and loop it past the window has nothing to do with any of this.

TweakHead

03.06.2014, 04:31 PM

And how do you fit more than one complete cycle in the one second window? You can't.

I can't? I just did. I've made the seemingly trivial task of placing a 30,5Hz inside a second time frame for you and - once again - that gives you 30 complete cycles and a half. Do yourself a favour and open it in an audio editor software and check for yourself.

P.S.

With the Virus setting "phase init" to 0 means there's absolutely no phase initial position at all. That's in the manual btw.

Spreader

03.06.2014, 04:55 PM

I can't? I just did. I've made the seemingly trivial task of placing a 30,5Hz inside a second time frame for you and - once again - that gives you 30 complete cycles and a half. Do yourself a favour and open it in an audio editor software and check for yourself.

No you didn't. Watch this (most important part starts around 3:07):
https://www.youtube.com/watch?v=UGUceuhSuUE

This demonstrates that if a sine is tuned to a frequency that can't exist in the window, the impossible sine frequency doesn't magically emerge from the spectrum. Rather the signal consists of many different allowed frequencies. That is also called spectral leakage. A 30.5hz frequency can not exist in a one second window, rather such a wave would contain different allowed frequencies of 30hz, 31hz, and maybe other frequencies...

Do you understand that half a cycle sine wave IS NOT a sine wave anymore? This is where your confusion seems to stem from. Sinewave has to start and end at the same phase, if it doesn't - it's not a sine wave. And that is what the FFT shows (it's a collection of different sine waves).
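
The leakage described here can be reproduced with a few lines of plain Python. This is only a sketch under stated assumptions: the 256 Hz sample rate, the naive DFT and the helper names (`dft_magnitudes`, `significant_bins`) are made up for illustration and are not taken from the video or from any analyzer mentioned in the thread. With a window of exactly one second, bin k sits at k Hz, so a 30 Hz sine lands on a bin while a 30.5 Hz sine falls between two of them.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive DFT; returns amplitude-normalised magnitudes for bins 0..N/2."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        mags.append(abs(acc) * 2 / n)
    return mags

SAMPLE_RATE = 256   # Hz; assumed small so the naive DFT stays fast
N = SAMPLE_RATE     # exactly one second, so bin k corresponds to k Hz

def sine(freq_hz):
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(N)]

def significant_bins(mags, threshold=0.01):
    """Indices of bins whose magnitude exceeds the (arbitrary) threshold."""
    return [k for k, m in enumerate(mags) if m > threshold]

on_bin = dft_magnitudes(sine(30.0))
off_bin = dft_magnitudes(sine(30.5))

print(significant_bins(on_bin))          # a single bin for the 30 Hz sine
print(len(significant_bins(off_bin)))    # many bins for the 30.5 Hz sine
print(off_bin[30], off_bin[31])          # the two largest, roughly equal
```

The 30.5 Hz case shows no bin "at 30.5 Hz" because the one-second analysis has none; the energy is shared mostly by bins 30 and 31 and tails off on both sides.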

TweakHead

03.06.2014, 05:24 PM

From the comments below:

"NTS Press, 1 year ago, in reply to Alex Wong Chin Yung:

Yes, any situation that causes the array to contain anything other than an integer number of sinusoidal periods will result in leakage."

That is the case with the sample I posted. It translates to discontinuity. Discontinuity is what's causing the spectral leakage you seem so obsessed about. Plus, as I've stated previously, the frequency resolution of the FFT analysis is simply the length of time you choose to sample, in other words. Thus, one way to compensate for that, besides choosing a window that holds an integer number of periods of the signal - which is the same as looping one cycle btw, but I'm not even going to explain why to you - is fading in and out, 'cause if the waveform meets a zero crossing at both ends, then there's continuity and the FFT is only going to pick the exact frequencies contained in the signal. How does this work for you? Happy?

But the software on that link is actually quite useful, so thanks for pointing it out.
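
The effect of fading in and out can be tried numerically as well. The sketch below assumes a Hann-shaped fade over the whole one-second block, which is just one possible fade, plus a made-up 256 Hz sample rate and a naive DFT helper. In this setup the fade strongly suppresses leakage far from the tone, but the 30.5 Hz sine still spreads over the neighbouring whole-Hz bins rather than collapsing to a single line.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive DFT; returns amplitude-normalised magnitudes for bins 0..N/2."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))) * 2 / n
            for k in range(n // 2 + 1)]

SAMPLE_RATE = 256   # Hz; assumed for illustration
N = SAMPLE_RATE     # one-second block, bins 1 Hz apart

tone = [math.sin(2 * math.pi * 30.5 * i / SAMPLE_RATE) for i in range(N)]

# Hann window: a smooth fade in and fade out that reaches zero at both ends.
hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / N) for i in range(N)]
faded = [x * w for x, w in zip(tone, hann)]

raw = dft_magnitudes(tone)
win = dft_magnitudes(faded)

FAR_BIN = 60  # a bin well away from 30.5 Hz
print(raw[FAR_BIN], win[FAR_BIN])   # distant leakage drops sharply with the fade
print(win[29], win[30], win[31])    # energy is still shared by several bins
```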

TweakHead

03.06.2014, 05:35 PM

Spreader

03.06.2014, 05:50 PM

https://meocloud.pt/link/c794215c-b307-4689-b7ac-2bcb84b55065/Screen%20Shot%202014-06-03%20at%2018.34.02.png/

that's how I produced the sample, and that's the oscilloscope showing no signs of disturbance, plus an FFT analysis of the frequency spectrum showing a single harmonic in there. FFT resolution isn't big enough to show a perfect line on low frequencies for reasons - again - I'm not going to explain to you.

This turned out to be an even bigger mess than I thought it was. So as of now, I rest my case. Cheers. :cool:

I don't see how the picture you produced is relevant to anything you or I have said. We are talking about a 30.5hz sine wave existing in a ONE SECOND WINDOW here. I have never said that a 30.5hz sine wave is impossible to produce in a completely arbitrary window, which is what you used. (Or span at the sample length of, is that around 32k samples? - which failed at showing a point as well). In fact, a 2 second window has a frequency resolution of 0.5hz - no problem producing 30.5hz there.

I think in the last post you made you yourself admitted that the length of the sample determines the frequency resolution. Isn't that what this was all about?
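
The relation between window length and frequency resolution is easy to check: the bin spacing is 1 / (window length in seconds). The following is a minimal sketch with an assumed 256 Hz sample rate and hypothetical helper names. It shows that in a 2 second window, 30.5 Hz falls exactly on bin 61 and nothing leaks into the neighbouring bins.

```python
import cmath
import math

def bin_spacing_hz(window_seconds):
    """Frequency resolution of a DFT taken over a window of this length."""
    return 1.0 / window_seconds

def bin_magnitude(signal, k):
    """Amplitude-normalised magnitude of a single DFT bin."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(signal))
    return abs(acc) * 2 / n

SAMPLE_RATE = 256      # Hz; assumed for illustration
WINDOW_SECONDS = 2
N = SAMPLE_RATE * WINDOW_SECONDS

tone = [math.sin(2 * math.pi * 30.5 * i / SAMPLE_RATE) for i in range(N)]

spacing = bin_spacing_hz(WINDOW_SECONDS)      # 0.5 Hz per bin
target_bin = round(30.5 / spacing)            # bin 61
peak = bin_magnitude(tone, target_bin)
neighbours = (bin_magnitude(tone, target_bin - 1),
              bin_magnitude(tone, target_bin + 1))

print(spacing, target_bin)
print(peak, neighbours)
print(bin_spacing_hz(1.0), bin_spacing_hz(0.01))  # 1 s and 0.01 s windows
```

By the same formula, the 0.01 second window suggested earlier has bins 100 Hz apart.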

TweakHead

03.06.2014, 06:42 PM

Yes, it's determined by that. But FFT is just a way of analysing a signal. The frequency resolution is set by how long (in time) that sample is - aka block size. In other words, we agree on that.

What I don't agree with is that there's no single complete cycle within that array. Because there is. And if you were to loop just one of them, the result would be exactly like that in the picture that I posted. So it's possible to produce a 30.5Hz sine wave within the constraints of that time span. If it was to end half a cycle earlier, the frequency of the signal would be the same, there would be no audio click, just a little silence before it starts over. And absolutely no spectral leakage.

The problem is that the samples collected for the FFT analysis are not synced to the frequency of the signal. That depends on the refresh rate you set for it and the number of samples it collects (block size again), so there's no perfect alignment between the two things. They need to be aligned to produce accurate results; otherwise you get waveforms cut at random points that will indeed be interpreted (calculated) as a different waveform altogether and hence show some other harmonics which should not be there. So if you were to fade that array in and out, discontinuity would not be a problem and no other waveform would be calculated instead of the one present in the signal, but you'd need a fairly high refresh rate to compensate for the measurement of amplitude, presumably - not even pretending to be an expert here.

But why does this matter so much to you? Even with all this in mind, let's assume we're on the same page as of now - that I finally made some sense of your words - how do you explain why note duration would be such an important thing for supersaws?

P.S. on span I've just selected high resolution, didn't manually select block size. ;)

Spreader

03.06.2014, 07:15 PM

Yes, it's determined by that. But FFT is just a way of analysing a signal. The frequency resolution is set by how long (in time) that sample is - aka block size. In other words, we agree on that.

GREAT! You are correct that FFT is a way of analyzing a signal, but it is also a way of constructing signals (remember, it perfectly captures the time domain signal), and that is relevant in the supersaw case.

What I don't agree with is that there's no single complete cycle within that array. Because there is. And if you were to loop just one of them, the result would be exactly like that in the picture that I posted. So it's possible to produce a 30.5Hz sine wave within the constraints of that time span. If it was to end half a cycle earlier, the frequency of the signal would be the same, there would be no audio click, just a little silence before it starts over. And absolutely no spectral leakage.

There are complete cycles of 30.5hz in that signal, but upon a close look they are actually made from a combination of 30 and 31hz sine waves (+ maybe others). It is not possible to produce 30.5hz within the limitations of that time span. The 30.5hz frequency simply does not exist in that time span. Look at the FFT analyzer in the youtube video I provided, the frequencies are points - 30hz, 31hz... Nothing exists in between the points, 30.5hz doesn't and can't exist in such a signal, ever. Remember that the FFT constructs signals from complete sine waves.

The problem is that the samples collected for the FFT analysis are not synced to the frequency of the signal. That depends on the refresh rate you set for it and the number of samples it collects (block size again), so there's no perfect alignment between the two things. They need to be aligned to produce accurate results; otherwise you get waveforms cut at random points that will indeed be interpreted (calculated) as a different waveform altogether and hence show some other harmonics which should not be there. So if you were to fade that array in and out, discontinuity would not be a problem and no other waveform would be calculated instead of the one present in the signal, but you'd need a fairly high refresh rate to compensate for the measurement of amplitude, presumably - not even pretending to be an expert here.

This is just a way of saying that the block size (length of the signal) gives you the frequency resolution. We need not be concerned with what lies outside the signal (unless you are in the business of predicting - it's easy to say what's outside of our window when you know it beforehand, by the way).

But why does this matter so much to you? Even with all this in mind, let's assume we're on the same page as of now - that I finally made some sense of your words - how do you explain why note duration would be such an important thing for supersaws?

Let's say our frequency resolution is 1hz (a 1 second sample) and we want to use a specific detune amount. Now I can determine the exact number of saw waves that are needed to create all the possible different combinations of the waveform. You see, putting a saw wave at 30.5hz does NOTHING that putting one at 30hz, a second at 31hz etc. doesn't already accomplish. Except we are dealing with saw waves here, so there is actually a difference with the upper harmonics. That's why I used 16khz, and not 30hz...

Look at the FFT plot in the video I provided earlier. Once you have one sine wave at each of the FFT bins, generating more sine waves does nothing that can't be already done. That's the point...
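
One way to test the claim that the whole-Hz bins already describe everything inside the window is a round trip: take the DFT of one second of a 30.5 Hz sine, then rebuild the samples from those bins alone. This is only a sketch, with an assumed 128 Hz sample rate and naive `dft` / `idft` helpers written for this example. It says nothing about what happens outside the one-second window.

```python
import cmath
import math

def dft(x):
    """Naive forward DFT (all N complex bins)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT; returns the real part of each rebuilt sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * i / n)
                for k in range(n)).real / n
            for i in range(n)]

SAMPLE_RATE = 128   # Hz; assumed, kept small because the DFT here is O(N^2)
tone = [math.sin(2 * math.pi * 30.5 * i / SAMPLE_RATE)
        for i in range(SAMPLE_RATE)]          # exactly one second

spectrum = dft(tone)       # bins at 0, 1, 2, ... Hz only
rebuilt = idft(spectrum)   # sum of whole-Hz sinusoids

max_error = max(abs(a - b) for a, b in zip(tone, rebuilt))
print(max_error)   # tiny: rounding error only
```

The rebuilt samples match the original to within floating-point rounding, so within that second the 30.5 Hz segment and the sum of whole-Hz components are the same signal.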

TweakHead

03.06.2014, 08:38 PM

Correct me if I'm wrong here: but the number of points within a second is determined by our choice of sample rate, right?

so this would give us 44100 points a second, for example. If you divide this number by 30,5 you get 1445,9016393442623 points per cycle, right?

While for 30Hz you'd have 1470 points. For 31Hz it would be 1422,5806451612902 points. Of course the problem is there's nothing but whole numbers in there. So you're saying that we need more time so that there's enough points to get to a whole number and thus completing a perfect cycle, I think. And I'm saying that the mere phase difference should be enough to distinguish these pitches even within such a short time frame. That the time duration of the cycles is enough to determine pitch with precision, provided there's at least one complete cycle (???).
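
The per-cycle point counts are a single division each, so they are easy to verify. A small sketch (the function name is made up for this example):

```python
def samples_per_cycle(freq_hz, sample_rate=44100):
    """How many sample points one cycle of freq_hz spans at sample_rate."""
    return sample_rate / freq_hz

for freq in (30.0, 30.5, 31.0):
    print(freq, samples_per_cycle(freq))
```

Only the 30 Hz case comes out as a whole number of points per cycle.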

Spreader

03.06.2014, 10:03 PM

Correct me if I'm wrong here: but the number of points within a second is determined by our choice of sample rate, right?

For a digital time domain signal, yes. For a frequency domain signal, the number of points depends on the length of the signal, and the usable bandwidth depends on the sample rate. An analog signal is however continuous in the time domain, we can't ever hear a digital signal...

so this would give us 44100 points a second, for example. If you divide this number by 30,5 you get 1445,9016393442623 points per cycle, right?

I didn't do the math, but I have no problems believing that what you are saying is correct. (but it is irrelevant).

While for 30Hz you'd have 1470 points. For 31Hz it would be 1422,5806451612902 points. Of course the problem is there's nothing but whole numbers in there. So you're saying that we need more time so that there's enough points to get to a whole number and thus completing a perfect cycle, I think. And I'm saying that the mere phase difference should be enough to distinguish these pitches even within such a short time frame. That the time duration of the cycles is enough to determine pitch with precision, provided there's at least one complete cycle (???).

No, this is not at all what I am saying. The limitations I am describing apply to the analog domain and to infinite sample rate digital signals as well. You can't have a 30.5hz sine wave in a one second time window, be it analog, digital, infinite sample rate or whatever. The sine just won't fit in the window! A signal can only consist of sine waves that fit in the window (because by definition those are sine waves; other waves are not sine waves, meaning that they are composed of many different sine waves)...

What you are describing is simply that periodic signals in the analog domain may not be periodic in the digital domain, in the sense that the sample points would repeat themselves periodically as the analog wave does. To get around this some of the digital cycles are going to have more points. This is unimportant, because they will still fit into the one second window perfectly. However, this does mean we can't take a Fourier transform of a 1.0000001 second signal, because an FFT size of 44100.00441 samples does not exist. So there is that type of restriction with digital, which can be overcome with oversampling though.

TweakHead

03.06.2014, 10:38 PM

I thought that interpolation had become integrated with most audio applications by now, but I'm merely presenting a guess here - as I'm not and don't pretend to be an expert on such matters, at the time being we're sailing outside the waters that are most familiar to me, so keep that in mind.

Your last sentence kind of confirms what I was trying to say with the point numbers, I can picture a scenario where a perfectly contoured waveform in analogue wouldn't translate well in digital due to it not being aligned with the points and where that signal would happen to translate into that grid - this is interesting and honestly haven't thought much about it just yet.

No developer here either, but it feels like some algorithm could be implemented to make a good guess based on some results. Meaning that the estimate could be almost spot on based on the behaviour of the wave at certain points. Always thought that was what interpolation means and does to the waveform. But I'm guessing efforts in that area are made to work within a minor error margin, but we're never talking about absolute precision here, just not so good guesses and better guesses - and we still have to factor in further latency introduced by the processing of this somehow, I guess.

Spreader

04.06.2014, 02:59 AM

Your last sentence kind of confirms what I was trying to say with the point numbers, I can picture a scenario where a perfectly contoured waveform in analogue wouldn't translate well in digital due to it not being aligned with the points and where that signal would happen to translate into that grid - this is interesting and honestly haven't thought much about it just yet.

This is very simple thankfully. Any analog waveform can be captured and almost perfectly reproduced with digital sampling as long as the sample rate is more than 2x the highest frequency in the signal. But as was pointed out, there is a small catch. If you were to loop the digital signal it's not the same thing as looping the analog signal. So if the original analog signal is 2.5 sample points long, looping the digital signal will produce something different than looping the analog signal. The FFT would also be different because in digital you could not take a 2.5 sample point length FFT. Once you oversample though, there is no problem. This is also why wavetable oscillators of most frequencies need to be longer than one analog cycle and possibly one reason why extreme pitch resolution requires a lot of processing power - the digital oscillators can't be periodic even if the resulting analog waveform is.
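
How long a digital buffer must be before it loops cleanly can be worked out with exact fractions. The sketch below is an illustration only (the function name is hypothetical): it reduces the period in samples to lowest terms p/q, which means q whole cycles span exactly p samples.

```python
from fractions import Fraction

def shortest_loop(freq_hz, sample_rate):
    """Smallest (samples, cycles) pair where whole cycles fill whole samples.

    freq_hz may be an int, a Fraction, or a decimal string such as "30.5".
    """
    period = Fraction(sample_rate) / Fraction(freq_hz)  # samples per cycle
    # Fraction keeps p/q in lowest terms: q cycles take exactly p samples.
    return period.numerator, period.denominator

print(shortest_loop(2, 5))             # a 2.5-sample cycle
print(shortest_loop(30, 44100))        # whole number of samples per cycle
print(shortest_loop("30.5", 44100))    # needs a loop longer than one second
```

For the 2.5-sample case the shortest exact loop is 5 samples holding 2 cycles. At 44100 Hz, a 30.5 Hz cycle only repeats exactly after 88200 samples, which is 61 cycles or two seconds.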

No developer here either, but it feels like some algorithm could be implemented to make a good guess based on some results. Meaning that the estimate could be almost spot on based on the behaviour of the wave at certain points. Always thought that was what interpolation means and does to the waveform. But I'm guessing efforts in that area are made to work within a minor error margin, but we're never talking about absolute precision here, just not so good guesses and better guesses - and we still have to factor in further latency introduced by the processing of this somehow, I guess.

I am not a developer (far from it, lol) either. There certainly seem to be a lot of unexplored waters in audio. In the end, I guess it's not that big of a market.

Anyway, oversampling, or in other words interpolation with a sinc function (which also goes by the name of low pass filter), gives you more points along the analog signal - well, extremely close anyway (analog filters are usually worse than digital ones, and won't be linear phase). But once again, oversampling affects the frequency domain so that higher frequencies can be produced; the frequency resolution remains the same. The only way to increase frequency resolution is to have a longer signal.
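
Sinc interpolation itself fits in a few lines. This is a rough sketch, not how any particular converter or synth does it: it assumes a 3 Hz sine sampled at 32 Hz and a finite buffer of 256 samples, so the sum is truncated and the result is only approximate. A real oversampler would use a windowed, much shorter filter.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sinc_interpolate(samples, position):
    """Estimate the band-limited signal at a fractional sample position."""
    return sum(s * sinc(position - n) for n, s in enumerate(samples))

SAMPLE_RATE = 32   # Hz; assumed for illustration
FREQ = 3.0         # Hz; well below half the sample rate

samples = [math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE) for n in range(256)]

position = 128.5   # halfway between two stored samples, mid-buffer
estimate = sinc_interpolate(samples, position)
exact = math.sin(2 * math.pi * FREQ * position / SAMPLE_RATE)

print(estimate, exact)   # close, limited by the truncated sum
```

The estimate between samples lands close to the true sine value, and at whole-sample positions the stored samples are returned unchanged.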

If you want to know how the analog signal is constructed from digital check: http://lavryengineering.com/pdfs/lavry-sampling-theory.pdf