[maemo-developers] Ogg vorbis/tremor dsp task questions

Thu Sep 13 16:12:01 EEST 2007

Hi,

On 9/6/07, Simon Pickering <S.G.Pickering at bath.ac.uk> wrote:
> Hello all,
>
> Don't get too excited, I'm writing code, I don't have anything working.
> I should also add that I need to check about the copyright for some of
> the ASM code I'm using before I can release anything very much, so this
> is a theoretical discussion more than anything else.
>
> There are some bits of c/pseudo-c code here (which is me getting my
> ideas down more than anything else) if anyone wants to look at the
> structures I am/was planning to use:
> http://people.bath.ac.uk/enpsgp/nokia770/dsp/vorbisdec/
>
> Anyway, my email is really to pick people's brains as to how to
> implement the split between the ARM and the DSP. I'm picking up the way
> that Tremor works as I go along, but am by no means an expert.
>
> The problem is really how to split the work across the two processors.
> Using the DSP gateway one usually sends either buffers of data or
> single word data to the DSP and it then processes them (or data in
> shared memory) and returns. It is (I believe) bad form to run a
> function on the DSP that never returns. For one thing polling must be
> disabled, and for another I'm not sure that any other DSP tasks would
> be able to run at the same time (e.g. a pcm dsp sink).
>
> Nevertheless my first try was to run the whole of the Tremor code on
> the DSP. I wrote some callbacks for use with the ov_open_callbacks()
> function so that data could be written to the shared memory buffer when
> requested by the DSP. The DSP also signals the ARM (which blocks and
> waits for signals from the DSP) when it needs more data or has pcm data
> to be read from the output buffer.
>
> This code is not completely finished - I need to sort out the block
> allocation code on the DSP side, but it's heading the right way once I
> clean it up (see link above). But... this doesn't really seem to be the
> "right way" to do this job. The DSP is called using one of the usual
> word-receive callbacks (i.e. the ARM can send one message to the DSP
> and then cannot ever again via the DSP gateway mechanisms) and then
> enters a function from which it never returns. I've no idea what this
> will do to other DSP tasks, but it just feels wrong.
>
> My next thought was to try to separate out the file opening and leave
> that on the ARM and to send the DSP complete vorbis packets to process
> and use. This should work nicely for sound data as a single vorbis
> packet is processed at one time by ov_read() and output. Therefore the
> ARM could signal the DSP that a new packet is available and that it
> should process it and then return. Unfortunately it's a bit more
> complicated as sometimes extra packets are needed to set up the
> codebooks, etc. This doesn't sound too bad in theory (the DSP could
> signal that it needs 3 packets for the decoder setup and then wait for
> the ARM to send them over, then continue), but the code is pretty well
> mixed in together (in ov_read() and the functions it calls). This is
> where I'm asking for some help/advice/pointers to useful docs/different
> code. I think that this approach is probably the cleanest split, it
> just needs either more thought on my part, or some outside input from
> vorbis/tremor experts.
>
> Since trying this approach, and being thoroughly frustrated by my lack
> of understanding of how to split up the code, I thought I'd just
> implement the *_dsp_* functions on the DSP and use wrapper functions on
> the ARM side so that the code doesn't need to be altered too much/at
> all. I had thought this would work well, but am now encountering the
> wonders of needing to copy across a vorbis_info struct (and all its
> associated pointers and data) to the DSP side (This code is at the url
> above). I suppose this is not so bad, but it is hassle and it brings me
> back to the second idea and makes me wonder if it wouldn't be better to
> do it that way (and avoid constructing these structures on the ARM-side
> at all).
>
> I should add that with all this copying (caused by creating things on
> the ARM and copying them to the DSP), one needs to perform endianness
> changes as the DSP is big-endian and the ARM is little-endian. This makes
> things a bit more complex and messy.
>
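
On the endianness point, here is a minimal sketch of the byte swapping
involved, assuming the fields being copied are plain 16-bit and 32-bit
integers. The helper names (swap16, swap32, swap32_buffer) are made up
for illustration and are not part of the DSP gateway or Tremor. It also
ignores struct padding and the DSP's word addressing, which would need
separate handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Hypothetical helper: swap n 32-bit words in place, e.g. just before
 * a table is copied into the buffer shared with the DSP. */
static void swap32_buffer(uint32_t *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        buf[i] = swap32(buf[i]);
}
```

Each pointer inside a struct such as vorbis_info would still have to be
followed and its target converted field by field; the helpers above
only cover the leaf integers.
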
> Therefore, I'm interested in any input as to what would be the best way
> to attack this problem. If anyone needs further clarification of the
> way one needs to interface with the DSP then I'm more than happy to
> help, either on email or on IRC.

Unfortunately I can't help you either, but if you could put your code
somewhere with instructions about how to compile and use it I might
come up with some ideas.
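
In the meantime, one rough idea for the packet-based split: the ARM
could look at the first byte of each Vorbis packet to tell the three
header packets (types 1, 3 and 5, each followed by the string "vorbis")
from audio packets (lowest bit clear), and pick which message to send
to the DSP accordingly. This is only a sketch; the enum and function
names are hypothetical, and it says nothing about how Tremor itself
would need to be restructured.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a Vorbis packet, as seen by the ARM. */
enum pkt_kind {
    PKT_AUDIO,
    PKT_HDR_IDENT,    /* header type 1: identification */
    PKT_HDR_COMMENT,  /* header type 3: comments */
    PKT_HDR_SETUP,    /* header type 5: codebooks and other setup */
    PKT_INVALID
};

/* Classify a complete Vorbis packet by inspecting its first bytes. */
static enum pkt_kind classify_packet(const unsigned char *data, size_t len)
{
    if (data == NULL || len == 0)
        return PKT_INVALID;

    /* Audio packets have the lowest bit of the first byte clear. */
    if ((data[0] & 1) == 0)
        return PKT_AUDIO;

    /* Header packets carry a type byte followed by "vorbis". */
    if (len < 7 || memcmp(data + 1, "vorbis", 6) != 0)
        return PKT_INVALID;

    switch (data[0]) {
    case 1:  return PKT_HDR_IDENT;
    case 3:  return PKT_HDR_COMMENT;
    case 5:  return PKT_HDR_SETUP;
    default: return PKT_INVALID;
    }
}
```

With something like this, the ARM would know up front that the three
header packets must reach the DSP before any audio packet, rather than
having the DSP ask for them.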

Also, I think you might find it useful to check out OpenMAX DL:

OpenMAX DL (Development Layer) APIs contain a comprehensive set of
audio, video and imaging functions that can be implemented and
optimized on new CPUs, hardware engines, and DSPs and then used for a
wide range of accelerated codec functionality such as MPEG-4, H.264,
MP3, AAC and JPEG.

http://www.khronos.org/openmax/

I'm with Krischan: I also appreciate what you are doing; it's very interesting.

Cheers!

-- 
Felipe Contreras