automatically determine mic delay #1017

dgruss · 2025-06-05T20:54:23Z

These are first steps in the direction of automatically determining the mic delay.

in the optionsrecord screen (where you select the microphones):

press W to play the wave file and measure the latency (shown on the top right). try this multiple times to figure out what the actual latency is.

basisbit · 2025-06-11T20:48:09Z

In UltraStar Play, we added a similar feature. However, we ended up having to play 3 different frequencies to make it more resilient to false detections caused by noise and to issues caused by input signal filters.
The audio files in UltraStar Play are in https://github.com/UltraStar-Deluxe/Play/tree/master/UltraStar%20Play/Assets/Common/Audio/SineWaveTones and the code for it can be found here: https://github.com/UltraStar-Deluxe/Play/blob/master/UltraStar%20Play/Assets/Scenes/Options/RecordingOptions/CalibrateMicDelayControl.cs
It is MIT licensed, and thus can also be used in USDX without any issues.

dgruss · 2025-06-15T10:38:02Z

that sounds more robust, but i wouldn't really know how to integrate that in the usdx code base without rewriting much more...

i went for an option with only minimal code changes (not even 100loc)

dgruss · 2025-06-28T12:38:44Z

i tried this on multiple x86 windows and arm linux now, works pretty well with the wave file. midi has delays... i might just remove that, and then the macos build will also work.
i'm not sure why the thing only works after entering the sing menu once, so this is still a bug. other than that i think it's pretty convenient.

alternative locations: one of the option menus, e.g., the microphone menu, where the microphone is selected. then we could even do a microphone-specific measurement. any thoughts?

dgruss · 2025-07-20T14:15:07Z

moved this to the options menu.

barbeque-squared · 2025-07-23T14:12:24Z

UI-wise this is a really good place for it!

It's possible this is specific to my system, but I get extremely varied results. Both mic boost and threshold also seem to affect the detected delay.

Mic boost seems pretty binary: if threshold remains the same, different values of mic boost will either make it detect it, or not detect anything at all. Makes sense, seems like expected behaviour.

I can't quite figure out what the Threshold values I'm observing are doing, though. I suspect it's picking up keyboard noise, which gets detected as C6 for some reason. Tapping on my desk is also C6. Dragging the microphone around is C6. I think this might accidentally be the root cause of a "bug" I've been observing for quite some time now, where if I ctrl-right through a song that (probably) has a lot of C (or B / C / C# if playing on Normal), you get an insane amount of points.

But there's a second thing going on (I can't really tell whether it is a bug in this PR, a bug/feature elsewhere in USDX, or something I just can't properly test without a more involved setup where the C6 thing is less pronounced): when keeping Mic Boost the same, the higher the Threshold, the higher the (averaged) reported delay appears to be. In my particular case I can get it to fairly reliably report:

10% Threshold: -5 to 80 ms (I suspect C6 issue)
15% Threshold: 60 to 150 ms
20% Threshold: 110 to 200 ms
any higher threshold: 150 to 250 ms

In this particular case, the ingame delay is around 140ms so the numbers are still useful, but I'd need my other setup to tell more on this.

Very offtopic C6 theory but otherwise I'll forget about it
Chances are that as soon as the signal gets above a certain threshold/volume/amplitude, the pitch detection stuff must find something. But there's no confidence check whatsoever. What I'm saying is that when a human is singing (or rapping?), there's probably one or a few very closely related frequencies/pitches that are clearly it, and it's probably some kind of normal distribution? If I hang a microphone above a traffic junction, or record a passing jet, I highly doubt all of that just "happens" to be C6.
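
To make the "confidence check" idea concrete, here is a minimal Python sketch. It is not USDX's pitch detection; it uses normalized autocorrelation, a common technique, and rejects a frame when the strongest peak is weak. The function name, the frequency range, and the 0.5 cutoff are all assumptions chosen for illustration.

```python
import math
import random


def detect_pitch(samples, sample_rate, min_hz=80.0, max_hz=1100.0,
                 min_confidence=0.5):
    """Return (frequency_hz, confidence), or (None, confidence) when the
    strongest autocorrelation peak is too weak to be trusted."""
    energy = sum(s * s for s in samples)
    if energy == 0.0:
        return None, 0.0
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(samples[i] * samples[i + lag]
                   for i in range(len(samples) - lag))
        score = corr / energy  # close to 1.0 for a clearly periodic signal
        if score > best_score:
            best_lag, best_score = lag, score
    if best_score < min_confidence:
        return None, best_score  # loud but not periodic: report no pitch
    return sample_rate / best_lag, best_score


rate = 16000
tone = [math.sin(2 * math.pi * 440.0 * n / rate) for n in range(2048)]
rng = random.Random(0)  # fixed seed so the noise is reproducible
noise = [rng.uniform(-1.0, 1.0) for _ in range(2048)]
print(detect_pitch(tone, rate))   # a pitch near 440 Hz with high confidence
print(detect_pitch(noise, rate))  # no pitch, low confidence
```

With a check like this, desk taps or keyboard noise would be reported as "no pitch" instead of being forced onto some note, provided the noise is not itself strongly periodic.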

dgruss · 2025-07-23T14:47:17Z

the changes are fairly high... i would not be surprised by +-10ms but this is much higher.

is this with a fixed cpu frequency? the cpu frequency can jump a lot and could cause delay changes like this. --> possibly we want to take this into account in future versions

is this a laptop? some laptops (and desktops) have audio cards and drivers that do echo cancellation already at the level of the audio driver. on linux this is less likely the case, on windows you can actually configure post-processing options for each audio device

regarding C6: that's just a random note... i might improve the audio also by playing a more flat sound without ramp up. the first version i had was using midi, but midi has more delays on my systems / is not even supported on macos builds and only partially on Linux. And I've seen very inconsistent delays there from just playing the midi note. so a wave file appears to be more robust and portable. then modulating to different notes is more tricky though. We could use an audio signal that plays two notes and try to detect the transition. That would be robust: if your background noise is one of the two notes, it will still pick up when the transition between the two notes happens. it would also be more robust to any ramp up effects
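
A rough Python sketch of the two-note transition idea, assuming the pitch detector yields a stream of (timestamp in ms, note name or None) frames. The frame format, the note names, and the function name are hypothetical and not taken from the USDX code.

```python
def find_transition_ms(frames, first_note, second_note):
    """Return the timestamp of the first frame reporting `second_note`
    whose most recent pitched predecessor reported `first_note`.
    Frames without a detected pitch (None) are skipped. Returns None
    if no such transition is found."""
    previous = None
    for time_ms, note in frames:
        if previous == first_note and note == second_note:
            return time_ms
        if note is not None:
            previous = note
    return None


# Background noise is detected as C6 before the test signal arrives.
frames = [
    (0, "C6"), (20, "C6"),   # background noise
    (40, "A4"), (60, "A4"),  # first test note
    (80, None),              # detector briefly loses the pitch
    (100, "E5"), (120, "E5") # second test note
]
print(find_transition_ms(frames, "A4", "E5"))
```

Only the switch from the first note to the second is timed, so background noise before the signal does not matter, even if it happens to match one of the two notes.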

barbeque-squared · 2025-07-24T05:09:07Z

This is a laptop. It does not do any hardware echo cancellation. I'm not sure if/how CPU frequency should influence this, but yes, it does ramp its frequency up and down automatically, so we can't rule it out. I'll figure out a way to get a build with this PR on my other PC and do some retests there.

Using a wave file is fine, midi doesn't work for me at all, and I'd like to avoid platform-specific bits of code.

For C6 I'll have to try some different microphones and also the other PC. I'll try to do it this weekend.

dgruss · 2025-07-24T06:02:58Z

CPU frequency has a massive influence as the latency is to a significant degree determined by software processing of the audio signal. Some kernel code, then some library code, then the usdx code, all do some buffering and passing on, usdx then processes and interprets the data. I would suspect that at least 60% of the latency is in the library + usdx code. This part of the latency scales inversely with the CPU frequency.
Depending on how recent your laptop is it will have a wider range. 10 year old laptops were in the range 0.8GHz to maybe 3-4GHz, more recent laptops might go up to around 5GHz and 0.4GHz on the lower end of the range.
Now if the latency is 80ms first and 60% is in the library and usdx, and it is running at 4GHz, then 48ms at 4GHz is about 192 million cycles of instructions. At 0.8GHz (due to the power budget regulating the CPU frequency this can fluctuate pretty quickly and unpredictably) running through the same instructions (because the code to interpret the audio signal didn't change) might now take 240ms.
So, yes, this could be CPU-related delays. It definitely is an issue we should look into. Possibly the game should fix the CPU frequency (which would require administrator permissions on Windows / root on Linux / not sure about macos).
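
The arithmetic above can be checked with a short Python sketch. It takes the comment's own assumptions at face value (80 ms total, 60% in software, work measured in a fixed number of cycles); the function names are made up for illustration.

```python
def software_cycles(total_latency_ms, software_share, freq_ghz):
    """Cycles spent in the software part of the latency at a given clock."""
    software_ms = total_latency_ms * software_share
    # 1 GHz corresponds to 1e6 cycles per millisecond.
    return software_ms * freq_ghz * 1e6


def software_latency_ms(cycles, freq_ghz):
    """Time needed to run the same number of cycles at another clock."""
    return cycles / (freq_ghz * 1e6)


cycles = software_cycles(80, 0.6, 4.0)      # 48 ms of software time at 4 GHz
slow_ms = software_latency_ms(cycles, 0.8)  # the same work at 0.8 GHz
print(f"{cycles / 1e6:.0f} million cycles, {slow_ms:.0f} ms at 0.8 GHz")
```

This reproduces the 192 million cycles and 240 ms from the comment. Real code is rarely purely clock-bound (memory and I/O waits do not scale the same way), so this is an upper-bound style estimate.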

dgruss · 2025-07-24T06:04:00Z

this might also explain part of the delay differences with more microphones: more code executed (this alone will also add latency of course) --> more power consumed --> less power budget for higher clock frequencies left --> lower clock frequencies --> additional higher delays due to lower execution speed

dgruss · 2025-07-30T08:13:32Z

i made some changes to work with 3 notes, each 900ms, very sharply cut off to avoid any transition delays.

this is output from my system:

INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [199 - 1019] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1079 - 1899] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2119 - 2878] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]
INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [241 - 1001] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1061 - 1882] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2102 - 2860] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]
INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [241 - 1001] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1062 - 1882] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2102 - 2860] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]
INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [242 - 1001] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1061 - 1881] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2102 - 2860] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]
INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [262 - 1001] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1061 - 1881] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2121 - 2880] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]
INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [241 - 981] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1062 - 1882] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2082 - 2860] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]
INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [242 - 1002] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1062 - 1882] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2082 - 2860] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]
INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [262 - 1002] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1062 - 1902] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2102 - 2880] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]
INFO:   Ping Notes Measured vs. Ideal: [TScreenOptionsRecord.DrawDelay]
INFO:   1. [260 - 1000] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]
INFO:   2. [1060 - 1880] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]
INFO:   3. [2100 - 2879] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]

In all cases the onset is more difficult as usdx pitch detection has to transition from one pitch to the other and it appears to be a bit slow with that. For the last note, the ending transition might be unreliable too as the usdx pitch detection might provide the same note a bit longer even if it's not there anymore.
So the decision on the actual delay is now based on the ending of the first two notes, which should be the most reliable.

Should the debug output be removed?
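
For reference, the "ending of the first two notes" rule can be applied to the log format above with a few lines of Python. This is a sketch, not the PR's code: averaging the two end offsets is an assumption, since the thread does not say how the two values are combined.

```python
import re
from statistics import mean

LOG_LINE = re.compile(
    r"(\d+)\. \[(\d+) - (\d+)\] ms instead of \[(\d+) - (\d+)\] ms"
)


def delay_from_log(lines):
    """Estimate the delay (ms) from one 'Ping Notes' measurement.

    Only the end offsets of notes 1 and 2 are used; the onsets and the
    end of the last note are ignored because they are less reliable.
    """
    offsets = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        note, _start, end, _ideal_start, ideal_end = map(int, match.groups())
        if note in (1, 2):
            offsets.append(end - ideal_end)
    return mean(offsets)


run = [
    "INFO:   1. [241 - 1001] ms instead of [0 - 900] ms [TScreenOptionsRecord.DrawDelay]",
    "INFO:   2. [1061 - 1882] ms instead of [900 - 1800] ms [TScreenOptionsRecord.DrawDelay]",
    "INFO:   3. [2102 - 2860] ms instead of [1800 - 2700] ms [TScreenOptionsRecord.DrawDelay]",
]
print(delay_from_log(run))
```

For this run the two end offsets are 101 ms and 82 ms, giving an estimate of 91.5 ms.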

dgruss · 2025-09-02T18:28:28Z

i tested this a bit more and interestingly, the delay measured here is much lower than what you have to configure for the microphone delay

s09bQ5 · 2025-09-08T08:34:31Z

I think SoundLib.Ping.Position needs to be used instead of relying on SDL_GetTicks since that's what is used during normal game play. But it will probably lower the measured delay even more because it removes the time between .Play and feeding the first sample into the audio driver from the equation. This time is not constant because samples are fed into the driver not on .Play, but when the next periodic sound card interrupt happens.

Would it make sense to store a time stamp in TCaptureBuffer.ProcessNewBuffer that is retrieved with an optional argument to TCaptureBuffer.AnalyzeBuffer? That way we can shave off a few more milliseconds of jitter from our measurements.
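
One way to read this proposal, as a Python sketch rather than USDX Pascal: the class and method names only mirror the ones mentioned above, the injectable clock is an assumption for testability, and the analysis is a stub.

```python
import time


class CaptureBuffer:
    """Toy stand-in for a capture buffer that remembers when data arrived."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._samples = []
        self._last_buffer_time = None

    def process_new_buffer(self, samples):
        # Called from the audio callback: take the timestamp here, as close
        # to the arrival of the samples as possible.
        self._last_buffer_time = self._clock()
        self._samples = list(samples)

    def analyze_buffer(self, with_timestamp=False):
        # Stub analysis: peak amplitude of the most recent buffer.
        peak = max((abs(s) for s in self._samples), default=0.0)
        if with_timestamp:
            return peak, self._last_buffer_time
        return peak


buf = CaptureBuffer(clock=lambda: 1.25)  # fake clock for a repeatable demo
buf.process_new_buffer([0.1, -0.4])
print(buf.analyze_buffer(with_timestamp=True))
```

The point is that the caller times the measurement against the moment the samples arrived, not the moment the analysis happened to run.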

in the main menu:
enter the sing menu and go back (i don't know why this is necessary)
press W to play the wave file and measure the latency (shown on the top right). try this multiple times to figure out what the actual latency is.
press M to play the same tone via MIDI output and measure the latency. on my system MIDI is 150ms slower than playing the wave file

dgruss · 2025-09-21T08:55:37Z

bit of cleanup and switching to the same timing method that is used for the MicDelay... now the delays make a lot more sense to me but i still have to test it on different systems

dgruss force-pushed the micsync branch from cc18846 to f60a1a3 July 20, 2025 14:13

dgruss and others added 5 commits September 21, 2025 10:54

move to the optionsrecord screen

3dd5a66

work with 3 notes

8efb081

..

bd4c315

switch timing method

0936f3b

dgruss force-pushed the micsync branch from bb9f8d0 to 0936f3b September 21, 2025 08:54

dgruss marked this pull request as ready for review September 21, 2025 18:13

dgruss marked this pull request as draft November 9, 2025 11:25
