automatically determine mic delay #1017
base: master
Conversation
---
In UltraStar Play, we added a similar feature; however, we ended up having to add audio playback of three different frequencies to make it more resilient to false detections caused by noise and to issues caused by input signal filters.
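The UltraStar Play implementation isn't shown here, but the idea of checking for several distinct probe frequencies can be sketched with the Goertzel algorithm (a cheap single-bin DFT). The probe frequencies and window size below are made-up examples, not values from either project:

```python
import math

def goertzel_power(samples, sample_rate, freq):
    # Power of one frequency bin via the Goertzel algorithm --
    # cheaper than a full FFT when only a few bins are needed.
    w = 2.0 * math.pi * freq / sample_rate
    coeff = 2.0 * math.cos(w)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def strongest_probe(samples, sample_rate, probes=(440.0, 660.0, 880.0)):
    # Return the probe frequency with the most energy in this window.
    # A noise burst may fake one frequency, but rarely the expected one
    # at the expected time in a multi-frequency schedule.
    return max(probes, key=lambda f: goertzel_power(samples, sample_rate, f))

# Demo: a pure 660 Hz tone is classified correctly.
sr = 44100
tone = [math.sin(2 * math.pi * 660.0 * n / sr) for n in range(2048)]
print(strongest_probe(tone, sr))  # -> 660.0
```

With three probes played in sequence, a detection only counts if each probe is recognized in its own time slot, which single-frequency interference cannot easily satisfy.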
---
That sounds more robust, but I wouldn't really know how to integrate that into the USDX code base without rewriting much more... I went for an option with only minimal code changes (not even 100 lines of code).
---
I tried this on multiple x86 Windows and ARM Linux machines now; it works pretty well with the wave file. MIDI has delays... I might just remove that, and then the macOS build will also work. Alternative locations: one of the option menus, e.g. the microphone menu, where the microphone is selected; there we could even do a microphone-specific measurement. Any thoughts?
---
Moved this to the options menu.
---
UI-wise this is a really good place for it! It's possible this is specific to my system, but I get extremely varied results. Both mic boost and threshold also seem to affect the detected delay. Mic boost seems pretty binary: if the threshold remains the same, different values of mic boost will either make it detect the tone or not detect anything at all. Makes sense, seems like expected behaviour. I can't quite figure out what the Threshold values I'm observing are doing, though. I suspect it's picking up keyboard noise, which gets detected as C6 for some reason. Tapping on my desk is also C6. Dragging the microphone around is C6. I think this might accidentally be the root cause of a "bug" I've been observing for quite some time now: if I ctrl-right through a song that (probably) has a lot of C (or B / C / C# when playing on Normal), you get an insane amount of points.

But there's a second thing going on (which I can't really tell is a bug in this PR, a bug/feature elsewhere in USDX, or something I just can't properly test without setting up a more involved setup where the C6 thing is less pronounced): when keeping Mic Boost the same, the higher the Threshold, the higher the (averaged) reported delay appears to be? In my particular case I can get it to fairly reliably report:

In this particular case, the in-game delay is around 140 ms, so the numbers are still useful, but I'd need my other setup to tell more on this. Very offtopic C6 theory, but otherwise I'll forget about it.
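One possible explanation for the threshold/delay correlation, sketched under the assumption that the probe tone has some fade-in and onset is taken as the first sample above the threshold (illustrative Python, not USDX code):

```python
import math

SR = 44100       # sample rate
RAMP_MS = 50     # assumed linear fade-in of the probe tone

def onset_ms(threshold):
    # Time of the first sample whose amplitude exceeds the threshold.
    ramp = int(SR * RAMP_MS / 1000)
    for n in range(SR):  # scan one second of a 440 Hz tone with fade-in
        amp = min(1.0, n / ramp)
        sample = amp * math.sin(2 * math.pi * 440.0 * n / SR)
        if abs(sample) >= threshold:
            return 1000.0 * n / SR
    return None

# Higher thresholds cross the rising envelope later, so the measured
# "delay" grows with the threshold even though the true latency is fixed.
for t in (0.1, 0.3, 0.6):
    print(t, round(onset_ms(t), 1), "ms")
```

If the real probe tone ramps up (or the input chain low-pass-filters its attack), this mechanism would produce exactly the observed monotonic threshold-to-delay relationship.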
---
The variations are fairly high... I would not be surprised by ±10 ms, but this is much higher. Is this with a fixed CPU frequency? The CPU frequency can jump a lot and could cause delay changes like this. --> Possibly we want to take this into account in future versions.

Is this a laptop? Some laptops (and desktops) have audio cards and drivers that do echo cancellation already at the level of the audio driver. On Linux this is less likely the case; on Windows you can actually configure post-processing options for each audio device.

Regarding C6: that's just a random note... I might also improve the audio by playing a flatter sound without ramp-up. The first version I had was using MIDI, but MIDI has more delays on my systems, is not supported at all on macOS builds, and only partially on Linux. And I've seen very inconsistent delays there from just playing the MIDI note. So a wave file appears to be more robust and portable; modulating to different notes is trickier then, though. We could use an audio signal that plays two notes and try to detect the transition. That would be robust: if your background noise is one of the two notes, it will still pick up when the transition between the two notes happens. It would also be more robust to any ramp-up effects.
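The two-note transition idea could look roughly like this. The function, note names, and stream format are hypothetical, assuming the pitch tracker emits one note (or None) per analysis window:

```python
def transition_ms(detected, window_ms, note_a, note_b):
    # 'detected' is the per-analysis-window note (or None) from the
    # pitch tracker; return the time of the first A -> B switch.
    # Windows that match neither note (noise) are ignored, so background
    # noise resembling one of the notes only pads one side of the switch.
    prev = None
    for i, note in enumerate(detected):
        if prev == note_a and note == note_b:
            return i * window_ms
        if note in (note_a, note_b):
            prev = note
    return None

# Example stream: noise, then note A, then note B (10 ms windows).
stream = [None, None, "A4", "A4", "A4", "B4", "B4"]
print(transition_ms(stream, 10, "A4", "B4"))  # -> 50
```

Timing the transition rather than the onset sidesteps both fade-in effects and a noise floor that happens to match one of the two notes.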
---
This is a laptop. It does not do any hardware echo cancellation. I'm not sure if/how the CPU frequency should influence this, but yes, it does ramp its frequency up and down automatically, so we can't rule it out. I'll figure out a way to get a build with this PR onto my other PC and do some retests there. Using a wave file is fine; MIDI doesn't work for me at all, and I'd like to avoid platform-specific bits of code. For C6 I'll have to try some different microphones and also the other PC. I'll try to do it this weekend.
---
CPU frequency has a massive influence, as the latency is to a significant degree determined by software processing of the audio signal. Some kernel code, then some library code, then the USDX code all do some buffering and passing on; USDX then processes and interprets the data. I would suspect that at least 60% of the latency is in the library + USDX code. This latency scales inversely with the CPU frequency.
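A back-of-the-envelope illustration of that inverse scaling, using an assumed (not measured) cycle cost for the capture-to-analysis path:

```python
# Assumed fixed cycle cost of the software path; a real figure would have
# to be profiled. Wall-clock cost then scales with 1 / frequency.
CYCLES = 12_000_000

def path_ms(ghz):
    return CYCLES / (ghz * 1e9) * 1000.0

# A laptop throttling from 3.6 GHz down to 1.2 GHz would triple this
# contribution to the latency.
for ghz in (1.2, 2.4, 3.6):
    print(ghz, round(path_ms(ghz), 2), "ms")
```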
---
This might also explain part of the delay differences with more microphones: more code executed (this alone will also add latency, of course) --> more power consumed --> less power budget left for higher clock frequencies --> lower clock frequencies --> additional delays due to lower execution speed.
---
I made some changes to work with three notes, each 900 ms, very sharply cut off to avoid any transition delays. This is the output from my system: In all cases the onset is more difficult, as the USDX pitch detection has to transition from one pitch to the other and appears to be a bit slow with that. For the last note, the ending transition might be unreliable too, as the USDX pitch detection might report the same note a bit longer even if it's not there anymore. Should the debug output be removed?
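Averaging over the three note boundaries might look like this sketch. The detected onset times are invented example numbers; the spread between the per-note estimates gives a rough hint of how reliable the measurement was:

```python
NOTE_MS = 900                          # each note lasts 900 ms
scheduled = [0, NOTE_MS, 2 * NOTE_MS]  # note starts within the wave file

def estimate_delay(detected_starts):
    # Per-note delay = detected onset minus scheduled onset; the mean is
    # the delay estimate, the spread a crude confidence indicator.
    diffs = [d - s for d, s in zip(detected_starts, scheduled)]
    return sum(diffs) / len(diffs), max(diffs) - min(diffs)

delay, jitter = estimate_delay([142, 1048, 1945])
print(delay, jitter)  # -> 145.0 6
```

A large spread would suggest one of the transitions was mis-detected (e.g. the sluggish onset tracking described above) and the run should be discarded or repeated.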
---
I tested this a bit more and, interestingly, the delay measured here is much lower than what you have to configure for the microphone delay.
---
I think SoundLib.Ping.Position needs to be used instead of relying on SDL_GetTicks, since that's what is used during normal gameplay. But it will probably lower the measured delay even more, because it removes the time between .Play and feeding the first sample into the audio driver from the equation. This time is not constant, because samples are fed into the driver not on .Play but when the next periodic sound card interrupt happens. Would it make sense to store a timestamp in TCaptureBuffer.ProcessNewBuffer that is retrieved with an optional argument to TCaptureBuffer.AnalyzeBuffer? That way we can shave a few more milliseconds of jitter off our measurements.
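A sketch of that timestamping idea in Python (the class and method names only mirror the Pascal ones; the pitch detection is a placeholder, not USDX code):

```python
import time

class CaptureBuffer:
    def __init__(self):
        self.samples = []
        self.last_buffer_time = None

    def process_new_buffer(self, samples):
        # Record when this capture buffer arrived from the driver, so the
        # delay is measured from arrival rather than from whenever the
        # analysis loop happens to run.
        self.last_buffer_time = time.monotonic()
        self.samples.extend(samples)

    def analyze_buffer(self, return_timestamp=False):
        # Optional-argument variant: callers that don't care about timing
        # keep the old single-value interface.
        note = self._detect_pitch(self.samples)
        self.samples.clear()
        if return_timestamp:
            return note, self.last_buffer_time
        return note

    def _detect_pitch(self, samples):
        # Placeholder for the real pitch analysis.
        return "A4" if samples else None

buf = CaptureBuffer()
buf.process_new_buffer([0.1, 0.2])
note, ts = buf.analyze_buffer(return_timestamp=True)
```

The key point is that the timestamp is taken in the capture callback, the earliest point the application sees the data, so scheduling jitter between capture and analysis drops out of the measurement.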
In the main menu:
- enter the sing menu and go back (I don't know why this is necessary)
- press W to play the wave file and measure the latency (shown at the top right); try this multiple times to figure out what the actual latency is
- press M to play the same tone via MIDI output and measure the latency; on my system MIDI is 150 ms slower than playing the wave file
---
Bit of cleanup and switching to the same timing method that is used for the MicDelay... now the delays make a lot more sense to me, but I still have to test it on different systems.
These are first steps in the direction of automatically determining the mic delay.
In the options-record screen (where you select the microphones):
- press W to play the wave file and measure the latency (shown at the top right); try this multiple times to figure out what the actual latency is