It's been a while since the last update so I wanted to give some news on how things are going with work towards the next release.
I've spent a lot of time working on getting the HLE audio code working on the Media Engine. I've been making steady progress, but it's been taking longer than I initially expected. Fortunately, I'm very close to getting all of the audio processing moved over to the ME - in fact I believe I have just one significant bug left to fix.
The issue seems to be a very odd synchronisation bug which causes the emulator to lock up when running the audio processing asynchronously. As with many of these types of bugs, it's proving quite hard to track down because as soon as I change the code to debug the problem, the issue goes away. A true Heisenbug :(
What's particularly annoying is that the bug is stopping me from measuring how much of a difference running the audio code on the ME makes. Hopefully I'll be able to fix the bug over the Christmas break and be able to publish some timings over the new year.
As part of this work, I've also been writing a general-purpose 'job manager', which coordinates batches of work between the main CPU and the ME. The idea is to build on top of J.F.'s MediaEngine.prx to provide a simple interface for queuing up and dispatching work asynchronously. When a job is added to the queue, a flag indicates whether the job is suitable for running on the ME, or whether it should just be run asynchronously on the main CPU instead.
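To make the flag idea concrete, here is a minimal host-side sketch of a queue that tags each job as ME-suitable or not and picks a target accordingly. Every name in it (JobQueue, JobTarget, me_safe, Drain) is hypothetical; it is not the Daedalus job manager or the MediaEngine.prx interface, and it runs jobs inline rather than dispatching them anywhere.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// All names here are hypothetical; this is not the Daedalus or
// MediaEngine.prx API.
enum class JobTarget { MainCpuAsync, MediaEngine };

struct Job {
    std::function<void()> work;
    bool me_safe;  // set by the caller when the job is suitable for the ME
};

class JobQueue {
public:
    explicit JobQueue(bool me_available) : me_available_(me_available) {}

    void Add(std::function<void()> work, bool me_safe) {
        assert(work != nullptr);
        jobs_.push_back(Job{std::move(work), me_safe});
    }

    // ME-suitable jobs go to the ME when one is available; everything else
    // falls back to running asynchronously on the main CPU.
    JobTarget TargetFor(const Job& job) const {
        return (job.me_safe && me_available_) ? JobTarget::MediaEngine
                                              : JobTarget::MainCpuAsync;
    }

    // Host-side stand-in for dispatch: runs each job inline, in order, and
    // counts how many would have been routed to each target.
    void Drain(int& me_jobs, int& cpu_jobs) {
        while (!jobs_.empty()) {
            Job job = std::move(jobs_.front());
            jobs_.pop_front();
            if (TargetFor(job) == JobTarget::MediaEngine) {
                ++me_jobs;
            } else {
                ++cpu_jobs;
            }
            job.work();
        }
    }

private:
    bool me_available_;
    std::deque<Job> jobs_;
};
```

Keeping the routing decision in one small function means the same queue can be exercised with the ME switched off, which is handy while the ME path is still being debugged.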
Initially just the audio processing will run through the job manager on the ME, but eventually it should be possible to run other pieces of work asynchronously too. I'm hoping to move parts of the HLE graphics processing over as well, but I need to investigate things a bit more first. That's a job for future releases, however.
Anyway, that's all for now. I'm off to eat mince pies and watch The Great Escape on TV. Merry Christmas everyone :)
-StrmnNrmn
Monday, December 24, 2007
Sunday, November 25, 2007
R14 Progress
It's been a while since I talked about R14 so I wanted to post a quick update on what I've been doing.
The Media Engine work has been going well. The job manager I talked about last week is now fairly functional and handles executing the audio upsampling code in 3 different modes: synchronously, asynchronously on the main processor, and asynchronously on the ME.
It's taken me a little longer to get the audio upsampling code working smoothly on the ME. I decided to focus on this initially (rather than Azimer's Audio HLE code) as it's a lot simpler and more self-contained, but getting it working on the ME without any glitches required a little bit of work. I had to rewrite the simple ring buffer I was using to be lock-free. This is straightforward when dealing with a single reader thread and a single writer thread on the same processor, but a little more care is required when the reader and writer are operating on separate cores without cache coherency. I think getting this running glitch-free has helped prepare me well for the bigger task of getting Azimer's HLE code running asynchronously on the ME. I'll be working on this next.
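For readers unfamiliar with the technique, a single-reader, single-writer lock-free ring buffer can be sketched roughly as follows. This is a generic illustration using standard C++ atomics, not the Daedalus buffer; the class name SpscRing is made up. It assumes a cache-coherent host, so it covers only the "same processor" case described above. On two cores without cache coherency the indices and data would additionally need explicit cache writeback and invalidation (or uncached memory), which this sketch does not attempt to model.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical single-producer / single-consumer ring buffer.
// Exactly one thread may call Push() and exactly one may call Pop().
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Writer side. Returns false when the buffer is full.
    bool Push(const T& value) {
        const std::size_t w = write_.load(std::memory_order_relaxed);
        const std::size_t r = read_.load(std::memory_order_acquire);
        assert(w - r <= N);  // the writer is never more than N ahead
        if (w - r == N) return false;
        data_[w & (N - 1)] = value;
        // Publish the element: the release store pairs with the acquire
        // load of write_ in Pop().
        write_.store(w + 1, std::memory_order_release);
        return true;
    }

    // Reader side. Returns false when the buffer is empty.
    bool Pop(T& out) {
        const std::size_t r = read_.load(std::memory_order_relaxed);
        const std::size_t w = write_.load(std::memory_order_acquire);
        if (r == w) return false;
        out = data_[r & (N - 1)];
        // Hand the slot back to the writer.
        read_.store(r + 1, std::memory_order_release);
        return true;
    }

private:
    T data_[N];
    std::atomic<std::size_t> write_{0};  // only modified by the writer
    std::atomic<std::size_t> read_{0};   // only modified by the reader
};
```

Each index has exactly one writer, which is what lets the buffer work without a lock: the producer only ever advances write_, and the consumer only ever advances read_.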
Besides the ME work, I've had an interesting diversion getting some new font rendering working in Daedalus. I saw on the ps2dev.org forums that BenHur had released a library for rendering text using the PSP's built-in fonts. I've always been a little unhappy with Daedalus's text rendering, and thought this would be a good opportunity to improve it. Here's a screenshot of the UI using BenHur's intraFont library (click through for a better-looking unscaled version):

I think this is looking a lot better than the previous font. The drop shadows really help make the text more readable. I also support multiple fonts for the first time, so the header text actually looks like header text :)
-StrmnNrmn
Tuesday, November 13, 2007
Media Engine progress
Over the weekend I described my plans for getting audio list processing working on the PSP's Media Engine. I'm making some decent progress so far. I've got Daedalus loading a kernel mode PRX to handle the ME nitty gritty, and I've managed to execute some test code on the ME successfully.
I've spent some time reviewing the audio code, trying to figure out if any bits are particularly amenable to running asynchronously, and trying to figure out if there is anything that is going to cause any problems when running this code on the ME. Fortunately it looks like all of Azimer's audio code is very straightforward C so there should be no problems getting it running on the ME once the synchronisation issues are dealt with. I've also realised that alongside the audio list processing there is also some expensive 44kHz upsampling code which will run very nicely on the ME too.
I have the feeling that debugging code on the ME is going to be particularly painful, so I want to try and catch as many of the obvious synchronisation bugs as early as possible. This evening I've started writing a job manager to 'simulate' executing code on the ME. The manager simply creates a thread which sits and waits for jobs to come in, mimicking the behaviour of MediaEngine.prx. Once I've got the audio list processing running correctly through the job manager, I can easily switch things over to get these jobs running on the ME in parallel to the main core. That's the plan, anyway :)
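The "thread which sits and waits for jobs" pattern can be sketched on a desktop machine like this. It is only an illustration using the standard C++ threading library rather than PSP threads, and SimulatedMe, Submit and WaitIdle are invented names, not anything from Daedalus.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the ME: one worker thread that runs jobs in
// the order they were submitted.
class SimulatedMe {
public:
    SimulatedMe() : worker_([this] { Run(); }) {}

    // Finishes any queued jobs, then stops the worker.
    ~SimulatedMe() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        wake_.notify_all();
        worker_.join();
    }

    void Submit(std::function<void()> job) {
        assert(job != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push_back(std::move(job));
        }
        wake_.notify_all();
    }

    // Blocks until the queue is empty and no job is running.
    void WaitIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return jobs_.empty() && !busy_; });
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return quit_ || !jobs_.empty(); });
            if (jobs_.empty()) return;  // quit requested, nothing left to do
            std::function<void()> job = std::move(jobs_.front());
            jobs_.pop_front();
            busy_ = true;
            lock.unlock();
            job();  // run outside the lock so Submit() never waits on a job
            lock.lock();
            busy_ = false;
            idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable idle_;
    std::deque<std::function<void()>> jobs_;
    bool busy_ = false;
    bool quit_ = false;
    std::thread worker_;  // declared last so the other members exist first
};
```

A caller would Submit() the audio work and later WaitIdle() (or poll) before consuming the results, which is where most of the synchronisation bugs mentioned above would be expected to show up.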
-StrmnNrmn
Sunday, November 11, 2007
Media Engine
Earlier I discussed my plans for getting Daedalus's audio processing working on the PSP's Media Engine.
As I mentioned in that post, it's not just a case of changing some compiler setting to get this working. I've not spent much time investigating the ME so I may be wrong on a few of these points, but here are the current issues that I think need solving.
Firstly, in order to access the ME I need to be running in kernel mode. This requires either running Daedalus in kernel mode, or (preferably) creating a kernel mode PRX that encapsulates the required functionality. I think kernel mode rules out anyone running with v1.50 firmware (hence my earlier post - please respond to the poll if you haven't already done so!). Maybe one of the more savvy PSP developers out there can correct me on this? If no-one is using v1.50 any more then maybe it isn't even an issue.
Another problem is that although the ME is essentially the same processor as the main core, it has a different memory map. This means that things like VRAM are invisible to the ME, so any code ported to run on the ME would have to be written to operate on main memory. This isn't an issue for Daedalus's audio list processing, but it would cause problems if I wanted to move display list processing to the ME too.
Touching on the memory map issue, another problem is the lack of cache coherency between the two cores. I need to be careful when accessing the same areas of memory with both cores to correctly flush and invalidate the data caches. Ideally any shared memory should be kept to a minimum, but this is easier said than done when porting existing code, rather than writing new code.
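One practical consequence of flushing and invalidating by hand is that caches work in whole lines, so a shared buffer that shares a line with unrelated data can cause that data to be written back or discarded along with it. The sketch below shows the arithmetic for finding the line-aligned range that covers a buffer. The 64-byte line size is an assumption and the helper names are invented. The PSPSDK does expose range-based calls such as sceKernelDcacheWritebackRange and sceKernelDcacheInvalidateRange for the main CPU, but they are deliberately not used here so that the example runs on any host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte data cache lines. Check the target hardware docs.
const std::uintptr_t kCacheLineBytes = 64;

struct LineRange {
    std::uintptr_t start;  // rounded down to a line boundary
    std::size_t size;      // a whole number of lines
};

// Smallest line-aligned range that covers [addr, addr + size).
inline LineRange CoveringLines(std::uintptr_t addr, std::size_t size) {
    assert(size > 0);
    const std::uintptr_t mask = kCacheLineBytes - 1;
    const std::uintptr_t start = addr & ~mask;
    const std::uintptr_t end = (addr + size + mask) & ~mask;
    return LineRange{start, static_cast<std::size_t>(end - start)};
}

// True when the buffer starts and ends on line boundaries, i.e. it shares
// no cache line with neighbouring data.
inline bool OwnsItsLines(std::uintptr_t addr, std::size_t size) {
    const LineRange r = CoveringLines(addr, size);
    return r.start == addr && r.size == size;
}
```

Buffers shared between the two cores would ideally satisfy OwnsItsLines, so that a writeback or invalidate on one core cannot touch anything the other core is not expecting.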
For a similar reason, any code which needs to run on the ME should avoid making any calls to the runtime library, including doing any system memory allocation. System calls are also ruled out. This is fairly easy to guarantee if you're writing new code, but again, it's a lot harder if you're porting existing code.
I think that's most of the issues from the hardware side. There are also a number of issues to be solved to do with the way that Daedalus handles audio and display list processing.
On the N64, the audio and display lists are processed asynchronously by the RSP coprocessor. In Daedalus, I can identify when these tasks are queued up for the RSP, intercept them, and process them synchronously (using high-level emulation rather than simulating the RSP execution directly).
The key thing here is that as far as the emulated N64 is concerned, audio and display list processing currently happens instantaneously. As soon as it kicks off the RSP it gets an interrupt to inform it that processing has completed. The whole process is very deterministic, and I'm worried that processing these lists asynchronously on the ME will cause a number of intermittent and hard-to-debug issues to crop up. On the other hand, processing these tasks asynchronously is much closer to the behaviour of a real N64, which may fix some timing-related issues. It will also allow Daedalus to exploit the inherent parallelism that N64 roms were designed to take advantage of.
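The difference in timing can be shown with a small model. In the synchronous case the completion interrupt is visible as soon as the task is kicked off; in the asynchronous case there is a window where the task has been handed over but the interrupt has not yet arrived. RspTaskRunner and its methods are hypothetical names for illustration only, and CompletePending() simply stands in for the moment a worker thread or the ME finishes.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical model of RSP task handling; not Daedalus code.
class RspTaskRunner {
public:
    explicit RspTaskRunner(bool async) : async_(async) {}

    // Called when the emulated N64 kicks off an RSP task.
    void Kick(std::function<void()> task) {
        assert(task != nullptr);
        interrupt_raised_ = false;
        if (async_) {
            // Hand the task off; the interrupt arrives some time later.
            pending_ = std::move(task);
        } else {
            // Process immediately, so the interrupt appears instantaneous.
            task();
            interrupt_raised_ = true;
        }
    }

    // Stands in for the asynchronous worker finishing the task.
    void CompletePending() {
        if (!pending_) return;
        pending_();
        pending_ = nullptr;
        interrupt_raised_ = true;
    }

    bool InterruptRaised() const { return interrupt_raised_; }

private:
    bool async_;
    bool interrupt_raised_ = false;
    std::function<void()> pending_;
};
```

Anything the emulated game does between Kick() and the interrupt in asynchronous mode is exactly the window where the intermittent issues described above could appear.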
My current plan for ME audio support in R14 is:
- Create a kernel mode PRX and get Daedalus successfully loading and invoking functions (under all supported firmwares). I've just about done this.
- Add the code to support initialising and running code on the ME to the PRX. Test invoking user mode functions from the main EBOOT.PBP. I'll probably be using J.F.'s great sample code as a reference for this. Thanks J.F.!
- Rewrite the audio list processing code so that it can be invoked synchronously or asynchronously as required (via some kind of configuration option). When running asynchronously it can just be run from a separate high-priority thread to start with. I can use this to test for various synchronisation issues without going through the pain of trying to do this on the ME first.
- Audit the audio list processing code to minimise any shared memory accesses or ensure that they are correctly synchronised with the main core/thread. Any CRT or system calls need to be eliminated or abstracted away (e.g. printfs become NOPs when compiled to run on the ME).
- Invoke audio list processing code from the ME.
- Cross fingers.
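As an illustration of the "printfs become NOPs" step in the list above, debug output and assertions can be hidden behind macros that compile away for the ME build. DAEDALUS_ME_BUILD, DBG_PRINTF, DBG_ASSERT and MixSamples are all hypothetical names chosen for this sketch; the real build flags and helpers may look quite different.

```cpp
// DAEDALUS_ME_BUILD is a hypothetical macro, assumed to be defined only
// when compiling code destined for the ME.
#ifdef DAEDALUS_ME_BUILD
// No CRT on the ME: both macros compile to nothing. Their arguments are
// never evaluated, so they must not carry side effects.
#define DBG_PRINTF(...) ((void)0)
#define DBG_ASSERT(cond) ((void)0)
#else
#include <cassert>
#include <cstdio>
#define DBG_PRINTF(...) std::printf(__VA_ARGS__)
#define DBG_ASSERT(cond) assert(cond)
#endif

// Adds two 16-bit samples and clamps the result to the signed 16-bit
// range, logging whenever clipping occurs.
inline int MixSamples(int a, int b) {
    DBG_ASSERT(a >= -32768 && a <= 32767);
    DBG_ASSERT(b >= -32768 && b <= 32767);
    const int sum = a + b;
    if (sum > 32767) {
        DBG_PRINTF("clipped high: %d\n", sum);
        return 32767;
    }
    if (sum < -32768) {
        DBG_PRINTF("clipped low: %d\n", sum);
        return -32768;
    }
    return sum;
}
```

The same source file can then be built for the main CPU with full diagnostics and for the ME with none, without scattering conditional compilation through the audio code itself.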
So, that's the plan; I'll keep you updated on my progress. If anyone has any experience doing this kind of thing on the ME it would be great to hear your thoughts.
-StrmnNrmn
R13 Issues, R14 Plans
Over the past week I've started making plans for what I want to do for R14.
To start with, R13 introduced a couple of issues which I want to fix. Firstly, a number of roms now no longer work with dynarec enabled, or show odd behaviour. For instance, Aerogauge now finishes the race as soon as the countdown completes. I've tracked this down to one of the dynarec optimisations I added in August, where I optimise fragments which jump back to themselves. This should be a 'safe' transformation, so it suggests there's a bug somewhere in my implementation. If I can't fix the bug in time for R14, I'll add a temporary setting to allow this optimisation to be disabled on a rom-by-rom basis (much like the 'dynarec stack optimisation' setting).
Secondly, it looks like something I changed for savestate support has broken the 'return to main menu' option. I added some logic to help ensure that when taking a snapshot for the savestate, the CPU is paused in a 'safe' state (i.e. no dynarec code is executing, nothing is running on the RSP, and nothing is executing in the branch delay slot.) It looks like I've messed something up which is causing the 'return to main menu' option to wait for a safe state before bailing out to the menu. Should be an easy one to fix.
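The three 'safe state' conditions listed above amount to a simple predicate. The sketch below spells it out; CpuState, its field names and BeginSnapshot are invented for illustration and do not reflect how Daedalus actually tracks this state.

```cpp
#include <cassert>

// Hypothetical view of the emulator state relevant to snapshots.
struct CpuState {
    bool in_dynarec_fragment;   // dynarec code is currently executing
    bool rsp_busy;              // something is running on the RSP
    bool in_branch_delay_slot;  // next instruction is in a delay slot
};

// Safe only when none of the three conditions holds.
inline bool IsSafeForSnapshot(const CpuState& s) {
    return !s.in_dynarec_fragment && !s.rsp_busy && !s.in_branch_delay_slot;
}

// Callers are expected to keep running the CPU until IsSafeForSnapshot()
// holds; this only counts how many snapshots were started.
inline void BeginSnapshot(const CpuState& s, int& snapshots_started) {
    assert(IsSafeForSnapshot(s));
    ++snapshots_started;
}
```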
Morgan suggested a nice idea in the comments, which is that I generate a thumbnail for the savestate as it is created to display alongside the slot in the UI. It's a little tricky to implement, as by the time the emulator is told to create a savestate, it has already obliterated the N64's framebuffer with the Daedalus UI. I'll have to do something quite clever like speculatively copy the N64's framebuffer into system memory every time you enter the Pause Menu, or create the screenshot on the first frame rendered after saving. Either way, I'd like to add this simple feature to R14.
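Once a copy of the framebuffer is sitting in system memory, producing the thumbnail itself is the easy part. Below is a rough sketch using nearest-neighbour sampling. It assumes 32-bit pixels stored row by row, which may not match the real framebuffer format, and both MakeThumbnail and the choice of nearest-neighbour scaling are illustrative rather than anything taken from Daedalus.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: shrinks a row-major image of 32-bit pixels by
// picking the nearest source pixel for each destination pixel.
std::vector<std::uint32_t> MakeThumbnail(const std::vector<std::uint32_t>& src,
                                         std::size_t src_w, std::size_t src_h,
                                         std::size_t dst_w, std::size_t dst_h) {
    assert(src.size() == src_w * src_h);
    std::vector<std::uint32_t> dst(dst_w * dst_h);
    for (std::size_t y = 0; y < dst_h; ++y) {
        const std::size_t sy = y * src_h / dst_h;  // source row for this row
        for (std::size_t x = 0; x < dst_w; ++x) {
            const std::size_t sx = x * src_w / dst_w;  // source column
            dst[y * dst_w + x] = src[sy * src_w + sx];
        }
    }
    return dst;
}
```

Nearest-neighbour is the cheapest option and is probably good enough for a small menu icon; averaging blocks of pixels would look smoother at the cost of more work per thumbnail.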
Next on my list for R14 is to look at making more significant performance improvements. Over the months many people have been asking when I'd get around to implementing audio on the PSP's Media Engine. I've talked about this before, but always kept putting it off in order to work on easier optimisations.
The Media Engine is a bit of unknown territory for me. Even though it's practically identical to the main CPU, you can't just change a setting and suddenly have your code running on it. There are a number of small hurdles I have to overcome before I can get audio working on the ME, but this is my big goal for R14 (I'll save the technical discussion for the next post). If all goes to plan this should mean that audio can always be enabled without a significant impact on framerate.
So in summary for R14: a few bug fixes, thumbnails for savestates, and audio without affecting framerate.
-StrmnNrmn
Labels: bugs, daedalus, dynarec, media engine, R14, savestates