DTS-HD Master Audio and bitrate (PBR) smoothing
#1
OK so this is bloody complicated and my brain is soup but I'll try my best to lay this out as clearly as I can. I know this is an extremely long-winded post, so feel free to completely ignore me or maybe just skim it and then read the bits in more detail that you think you might be able to clarify. It might be that this is entirely the wrong forum to ask this kind of specific question, but I figure it's worth a bash before I go poking about Doom9 or wherever else. Anyway, here goes...


1: What the hell is bitrate smoothing?

I was in the process of redoing my Snowpiercer resync to make it more accurate than it already was, which involved me decoding the .dtshd 7.1 audio track off a Blu-ray Disc to PCM, inserting some samples of silence for precise sync, then re-encoding it back to .dtshd again. The thing is, when I encoded back to DTS-HD MA, something weird happened: the new file was significantly smaller, despite being encoded losslessly in the same format as source. After testing my whole workflow for any losses and verifying that there has been no audio data lost at any point in the chain, I think I've possibly worked out why, though...

The DTS-HD Master Audio Suite encoder produces 2 files:
  1. a .dtshd audio bitstream file, which has variable bitrate

  2. a .dtspbr file, which contains an analysis of the changes in bitrate throughout the .dtshd audio bitstream
When you author a disc, pro authoring software uses the PBR (Peak BitRate) analysis contained in the .dtspbr file to "smooth" the bitrate so that it's more consistent across the whole bitstream, or as the DTS encoder's manual puts it...

Quote:The Peak Bit Rate Analysis Tool analyses variable bit rate encodings (DTS-HD Master Audio encoded streams) graphically plotting the selected encoding’s bit rate over time, as if the encoding had been “smoothed” for authoring using a Peak Bit Rate scheduling utility. The smoothing process redistributes data throughout the encoded stream for a more constant flow of data during disc played back. Smoothing is performed during the authoring process of a disc.

I think the point of this is to work out where in the bitstream the bitrate suddenly jumps from low to high or high to low, and smooth (i.e. make more gradual) that transition so it's less abrupt. I'm not sure why this is necessary but I assume it's to stop some part of the hardware/decoding chain freaking out because of sudden bitrate spikes (I'd guess that decoders might deal with this badly and fail to keep up with the jump quickly enough to output losslessly, which could potentially result in audible degradation of the output). Now, if the PBR smoothing is doing that, you'll presumably end up with more areas of higher bitrate, since it's presumably ramping the bitrate up over a longer time period rather than letting it stay low and then suddenly spike where needed; it follows that these areas of padded-up bitrate will make the file bigger. This might explain why the .dtshd file demuxed off a retail Blu-ray Disc was roughly 10% larger than my .dtshd encode fresh out of the DTS Suite, since mine had not yet gone through the bitrate-smoothing process (since this is only done when you actually author it to disc).

Now, to actually author a .dtshd stream to disc in a way that puts it through this bitrate-smoothing process, you need to take something else into account...


2: The DTS Suite header

If you author a .dtshd stream to disc using something like tsMuxeR, it'll just ignore the whole bitrate-smoothing thing altogether as far as I can tell, since it doesn't prompt for a .dtspbr input or give any indication that the output file is not going to be disc compliant. However, if you try to do it in pro authoring software like Scenarist, it will prompt for the PBR analysis to be included so that it can smooth the bitrate out in the audio stream it puts on the disc. In order for this process to work, you need to have two things:
  1. a .dtshd audio stream with a DTS-HD Master Audio Suite header attached (which goes before the actual DTS header and the audio bitstream itself)

  2. a .dtspbr file containing the PBR analysis
You can use a component of the DTS Suite called StreamTools to generate a PBR analysis if you don't have a .dtspbr file, but in order to do so, your .dtshd file needs to have this extra DTS Suite header. The trouble is, .dtshd streams demuxed from disc do not contain this header because it's stripped off during the authoring process, so you can't simply demux .dtshd from disc then plug it into StreamTools to generate a .dtspbr that you can then use to author back to disc later. I'd guess this is probably by design as an anti-piracy measure, which makes sense. (Note: I have no interest in circumventing this for the purposes of piracy; I'm only interested in correcting mis-steps made by official releases that I have bought multiple copies of, so that I can combine the best elements of each.) Furthermore, if I'm correct, it seems that .dtshd streams taken straight off a disc already have their bitrates smoothed out, so even if you did somehow manage to attach a new DTS Suite header and generate a PBR file, it would be analysing an already smoothed bitstream rather than the sort of un-smoothed variable bitrate stream that the encoder would have produced in the first place.

So, assuming that you wanted to author a compliant Blu-ray Disc using both a .dtshd stream and a .dtspbr file, you could do what I've done thus far: get the .dtshd stream off the disc, decode it to PCM, then feed it back through the official encoder in order to generate a new (non-smoothed, smaller) .dtshd audio stream and an accompanying .dtspbr bitrate analysis file to later feed to your authoring software. Right?

Well.. no, not necessarily, because there's something else that the encoder does to its output.


3: The sound of silence

Turns out that when you encode to .dtshd in the Suite, it adds something else to the start of the file in addition to the aforementioned extra header: 1024 audio samples' worth of zero byte silence. In terms of duration, this is equivalent to a little over 21 milliseconds (1024 samples / 48000 sample rate = 0.021333... seconds). So you can see what I'm talking about visually, here's a mono audio track that's been encoded to .dtshd then decoded back to PCM, both before and after I removed the added 1024 samples of silence at the start. I've highlighted the 2048 zero bytes (silence). The bit depth for this audio file is 16-bit, meaning that one audio sample consists of 16 bits, and there are 8 bits in a byte, so 16 bits / 8 = 2 bytes per audio sample. 2048 / 2 = 1024 samples.

[Image: dts-suite-silence-added.png] [Image: dts-suite-silence-removed.png]
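If you'd rather check this without squinting at a hex editor, something like the following Python sketch should do it. It counts how many all-zero sample frames sit at the start of a decoded PCM WAV. The function names are my own, and the WAV here is generated in memory as a stand-in for a real decode (16-bit mono, 48 kHz, 1024 samples of silence followed by a few non-zero samples).

```python
import io
import wave


def leading_zero_samples(wav_file, limit=4096):
    """Count the all-zero sample frames at the very start of a PCM WAV."""
    with wave.open(wav_file, "rb") as wav:
        # Bytes per sample frame = bytes per sample * channel count.
        frame_size = wav.getsampwidth() * wav.getnchannels()
        data = wav.readframes(limit)
    count = 0
    for offset in range(0, len(data), frame_size):
        if any(data[offset:offset + frame_size]):
            break  # first sample frame containing a non-zero byte
        count += 1
    return count


def make_test_wav(silent_samples, rate=48000):
    """Build a 16-bit mono WAV in memory: silence, then 16 non-zero samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit = 2 bytes per sample
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * silent_samples + b"\x01\x00" * 16)
    buf.seek(0)
    return buf


samples = leading_zero_samples(make_test_wav(1024))
print(samples, "samples =", samples * 2, "bytes =", samples / 48000 * 1000, "ms")
```

On a real file you'd pass the path to the decoded WAV instead of the in-memory buffer. Bear in mind that a track which genuinely starts with digital silence will report more than the encoder's added samples, so treat the count as a sanity check rather than proof.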

Now, this is an easy thing to fix if you're muxing to MKV, because you can simply apply a negative delay in eac3to like so:

Code:
eac3to inputfile.dtshd outputfile.dtshd -21ms

Note that this might seem at first glance like it isn't completely accurate because 1024 samples is not exactly 21 ms, but in practice, eac3to rounds the 21 ms up so that it does happen to cut off precisely the 1024 samples we want to remove.
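Here's the arithmetic behind that as a quick Python sketch. The assumption baked in (mine, based on the observed behaviour rather than on any eac3to documentation) is that the requested delay gets snapped to the nearest whole DTS frame of 512 samples; `frames_trimmed` is just a name I've made up for the calculation.

```python
def frames_trimmed(delay_ms, sample_rate=48000, samples_per_frame=512):
    """Whole frames removed for a delay, assuming snapping to the nearest frame."""
    frame_ms = samples_per_frame / sample_rate * 1000  # about 10.667 ms at 48 kHz
    return round(abs(delay_ms) / frame_ms)


frames = frames_trimmed(-21)
print(frames, "frames =", frames * 512, "samples")  # 2 frames = 1024 samples
```

21 ms works out to about 1.97 frames, which is why it lands on exactly 2 frames (1024 samples) under that assumption.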

Anyway, if you're looking to actually author it back to a disc (and potentially put it through the bitrate smoothing process discussed above), well...
  1. It may not necessarily be as simple as deleting the 1024 samples and either reusing the existing .dtspbr file or generating another, since I'm not certain whether the header will still be correct for the contents of the file after removing 1024 samples from the start of the bitstream.

  2. It might not even be correct to remove the 1024 samples of silence if authoring a disc anyway, which brings me to my final point...

4: I sync my mind is melting

The thing is, there surely has to be some reason that the DTS Suite encoder is inserting 1024 samples of silence in the start of the bitstream in the first place. Until recently, I had absolutely no idea why that might be, and I might still be miles off on this but I may have stumbled upon a very fuzzy semblance of understanding while working on something unrelated.

I was working with a graphical (as in, images, no text) PGS subtitle stream for another film and during testing, I muxed one of my PGS streams into the appropriate AVC video file to see how it looked. The result wasn't quite right: MPC-HC played it back with every PGS image shown 2 frames later than they should be. In trying to figure out why that might be, I went back to tsMuxeR to see what the log said. I found this:

Quote:B-pyramid level 1 detected. Shift DTS to 2 frames

Firstly, DTS here doesn't mean DTS audio, it stands for Decoding Time Stamp. Secondly, as far as I know, B-pyramid is a concept that relates purely to video streams, not things like PGS or audio. What I'm wondering here is whether tsMuxeR may have done something to the h.264 video stream during muxing to have it pull video frames out of buffer 2 frames later than it otherwise would, and to compensate for this, also adjusted the PGS subtitle track (and possibly also the audio track, but I'll get to that in a moment) so that it also displays its images 2 frames later. However, when I played the video back in MPC-HC, I'm guessing (and it's only a semi-educated guess) that it ignores whatever tsMuxeR did to the h.264 stream and just displays the frames as it ordinarily would, but the PGS track *is* delayed by the 2 frames, because whatever tsMuxeR does to achieve that has to be done differently for the PGS format.

Now... how does that relate to DTS-HD and the 1024 samples of silence? Mostly, probably not much at all. However, it's made me wonder again why that 1024 samples' worth of silence (amounting to roughly half a video frame in duration) is even inserted into the bitstream in the first place. Is it stripped back out during the authoring process, possibly as part of the bitrate smoothing / PBR processing? Or might it be there because of some quirk of hardware decoding that means it actually results in more correct sync compared with the video stream?
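For anyone wanting to check the "roughly half a video frame" figure, here's the sum in Python. The frame rate is an assumption on my part (24000/1001, i.e. the usual 23.976 fps for film on Blu-ray); nothing above states it explicitly.

```python
from fractions import Fraction

silence_s = Fraction(1024, 48000)    # 1024 samples at 48 kHz
video_frame_s = Fraction(1001, 24000)  # one video frame at an assumed 23.976 fps

ratio = silence_s / video_frame_s
print(f"silence: {float(silence_s) * 1000:.3f} ms")
print(f"video frame: {float(video_frame_s) * 1000:.3f} ms")
print(f"silence is {float(ratio):.3f} of a video frame")
print(f"2-frame shift: {float(2 * video_frame_s) * 1000:.1f} ms")
```

So the silence is a shade over half a frame at that rate, and the 2-frame shift tsMuxeR reported would be about four times as long as the silence.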


5: TL;DR

All of this digging has brought up a lot of questions that I've tried to find answers to with... not a lot of success. I think the main ones are as follows:
  1. Is the PBR smoothing the reason that a .dtshd stream taken directly from a disc is significantly (~10%) larger than the exact same audio fresh out of the DTS-HD Master Audio Suite encoder? I'm guessing that's exactly it, but it's a guess.

  2. Although tsMuxeR will quite happily mux .dtshd to a BD folder structure, ISO or .m2ts without a .dtspbr file, which presumably means it isn't doing anything to regulate bitrate spikes in the stream, I expect burning that to disc and playing it back on hardware will result in issues e.g. the audio decoder not reacting quickly enough to the bitrate suddenly increasing and causing artefacts / rendering the output at a lower bitrate than it should. At least, I expect that would be the case for .dtshd from the encoder (e.g. that I've then stripped the header from but not PBR-smoothed). Although...

  3. Since tsMuxeR doesn't require a .dtspbr file, but a .dtshd stream taken directly from a retail disc will presumably have been already PBR-smoothed during authoring of the original disc, would simply muxing that .dtshd stream (maybe with a delay having been applied) back to a disc result in an already smoothed out audio stream that complies with the Blu-ray Disc standard? Or is it more complex than that? I'm guessing it probably is, otherwise I don't know why you'd need to do the bitrate smoothing process during authoring rather than just doing it beforehand and authoring the result to the disc instead. EDIT: Actually, yeah, I'm forgetting that the total bitrate of video and audio can't exceed the max allowed for a Blu-ray Disc, so presumably that's the reason that the smoothing is not done until the authoring stage. So I guess doing this in tsMuxeR *might* work but it might also mean that the total bitrate is exceeded if your video bitrate and audio bitrate are both very high. I wonder what, if anything, tsMuxeR does to deal with this problem? (For what it's worth, MediaInfo does not report any bitrate info about a solo .dtshd file but if it's muxed into an MKV or something it'll tell you a figure which I assume is just the average bitrate. In the case of Snowpiercer, I found this was about 5.8 Mbps, which is presumably a non-issue seeing as the video bitrate is only about 20-odd Mbps. I believe the total allowed bitrate for the muxed stream on a Blu-ray is 48 Mbps (40 Mbps for video; the 54 Mbps figure that often gets quoted is the raw disc transfer rate), so as long as video + audio are below that, I'm guessing it's fine.)

  4. Does the 1024 samples' worth (1024 / 512 = 2 DTS frames) of silence added to the start of the .dtshd bitstream by the encoder get removed again during disc authoring, or is it actually supposed to be kept in there to correct for some idiosyncrasy or another? (In other words: should I be manually removing those 1024 samples before authoring or not?) Someone elsewhere has suggested that the DTS Suite header basically tells Scenarist to skip those 2 audio frames, so I think it is effectively removed. CONFIRMED: Yes, it gets removed because the DTS Suite header contains a "codec delay" of 1024 samples.

  5. If it actually is more accurate to remove the 1024 samples of silence from the start of the bitstream before muxing, will doing so break the DTS Suite header and/or PBR analysis thus causing issues with the bitrate smoothing process that's supposed to happen during disc authoring? (This may not be relevant if the right thing to do is leave it in, if you plan to author with something like Scenarist.) CONFIRMED: Yes, the header contains an instruction to skip the first 1024 samples as mentioned above ("codec delay") so presumably just deleting them and not editing the header would screw this up. But... I wonder if I could remove the silence and also edit the header to remove the codec delay so that the same files could be used with the PBR to author a disc in pro software (i.e. not tsMuxeR) *and* to mux into MKV with correct sync (since you can't just mux in the file with the 1024 sample silence to an MKV without it putting sync slightly off).
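On the bitrate budget raised in question 3, here's a trivial Python sketch of the headroom check. The video and audio figures are the rough Snowpiercer numbers from that question; the ceiling is passed in as a parameter because it's an assumption (48 Mbps for the muxed stream) that should be verified against the Blu-ray spec. `mux_headroom_mbps` is a made-up helper name.

```python
def mux_headroom_mbps(video_mbps, audio_mbps, mux_ceiling_mbps):
    """Spare bitrate left under the mux ceiling (negative means over budget)."""
    return mux_ceiling_mbps - (video_mbps + audio_mbps)


# Rough averages from question 3; the 48.0 ceiling is an assumption to verify.
headroom = mux_headroom_mbps(video_mbps=20.0, audio_mbps=5.8, mux_ceiling_mbps=48.0)
print(f"{headroom:.1f} Mbps spare")
```

The catch is that these are average bitrates, and the whole point of the PBR business is the peaks, so a comfortable average doesn't guarantee that no individual spike goes over.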
I know that's a whole lot of words, so I'll understand if nobody wants to read it. But it seems that info on this stuff is very thin on the ground and I'm very interested to know if anybody knows more about it and may be able to answer any or all of the above questions. Even if you do read through all that and have no idea, thank you for giving it a shot nonetheless!
#2
Remove the 1024 samples before remuxing with tsMuxeR.

Code:
eac3to input output -21ms
#3
(2020-05-29, 07:39 PM)Chewtobacca Wrote: Remove the 1024 samples before remuxing with tsMuxeR.

Code:
eac3to input output -21ms

I know that's correct for tsMuxeR, what I meant was I don't know if it's correct for pro software that actually does use the PBR analysis (such as Scenarist; I think Encore does too, but the last time I tried to use Encore for a project, I almost chucked my computer out the window).

That's what I've always done thus far, but yeah, not what I was trying to clarify. Sorry if that got buried in the wall of words!

EDIT: However, that link contains other info that actually is pertinent, I don't know how I didn't find it when I looked (i.e. before I posted this)...

Quote:Source Samples : 155965840
Sample Rate : 48000Hz
Samples Per Frame : 512
Codec Delay : 1024

If that header *is* the DTS-HD Suite header I've been talking about this whole time, then that confirms part of what I've been wondering: yes the header skips the 1024 samples / 2 DTS frames of silence, so no, they're not burned to the disc. Thank you for pointing me toward that post!
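As a quick sanity check on those four values, here's a small Python sketch that parses the quoted text and works out what the codec delay amounts to. It only parses the text exactly as pasted in the quote above; it doesn't know anything about the actual binary layout of the header, and `parse_header` is just an illustrative name.

```python
header_text = """Source Samples : 155965840
Sample Rate : 48000Hz
Samples Per Frame : 512
Codec Delay : 1024"""


def parse_header(text):
    """Turn 'Name : value' lines into a dict of ints (drops a trailing 'Hz')."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = int(value.strip().rstrip("Hz"))
    return fields


fields = parse_header(header_text)
delay_frames = fields["Codec Delay"] // fields["Samples Per Frame"]
delay_ms = fields["Codec Delay"] / fields["Sample Rate"] * 1000
duration_s = fields["Source Samples"] / fields["Sample Rate"]

print(delay_frames, "frames of codec delay")
print(f"{delay_ms:.3f} ms of codec delay")
print(f"{duration_s:.3f} s of source audio")
```

That gives 2 frames and about 21.333 ms, which matches the silence I measured at the start of the encoder's output.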
#4
Yeah, I was just summarizing that post. As I'm sure you know, the 1024 samples assumes 48kHz. At 96kHz, it's 2048. At 192kHz, it's 4096.
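Those three figures all scale linearly with the sample rate, which a short Python sketch makes obvious. The scaling rule here is inferred from the three numbers above rather than taken from any DTS documentation, and `codec_delay_samples` is a made-up helper.

```python
def codec_delay_samples(sample_rate, base_rate=48000, base_delay=1024):
    """Scale the 48 kHz codec delay linearly with sample rate (inferred rule)."""
    return base_delay * sample_rate // base_rate


for rate in (48000, 96000, 192000):
    delay = codec_delay_samples(rate)
    print(rate, "Hz:", delay, "samples =", round(delay / rate * 1000, 3), "ms")
```

In every case the delay comes out at the same 21.333 ms, which would suggest the -21ms trim applies regardless of sample rate.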

Anyway, this is where I get off. I hope you have all the info you need. Have fun!

