Android and AAC
As readers of this blog know, AAC encoding holds a spot close to my heart. I'm responsible for getting the worst AAC encoder of all time included in FFmpeg and was never successful at fixing it.
The recent Android 2.3 "Gingerbread" release contains an Apache-licensed AAC encoder as part of Android's new stagefright media library. I was excited that the free software community might finally have a good free AAC encoder. Unfortunately, the stagefright AAC encoder seems to be little more than an optimized version of 3GPP's 26.411 fixed-point reference encoder.
Origins
Looking at the two source trees, there is an immediate resemblance between the 3GPP code and the stagefright code:
    $ ls 26411-900/26411-900-ANSI-C_source_code/3GPP_enhanced_aacPlus_etsiopsrc_200907/ETSI_aacPlusenc/etsiop_fastaacenc/src
    aac_ram.c       bitenc.c        interface.c          psy_main.c   stat_bits.h
    aac_ram.h       bitenc.h        interface.h          psy_main.h   stprepro.c
    aac_rom.c       block_switch.c  line_pe.c            qc_data.h    stprepro.h
    aac_rom.h       block_switch.h  line_pe.h            qc_main.c    tns.c
    aacenc.c        channel_map.c   ms_stereo.c          qc_main.h    tns.h
    adj_thr.c       channel_map.h   ms_stereo.h          quantize.c   tns_func.h
    adj_thr.h       dyn_bits.c      pre_echo_control.c   quantize.h   tns_param.c
    adj_thr_data.h  dyn_bits.h      pre_echo_control.h   sf_estim.c   tns_param.h
    band_nrg.c      fft.c           psy_configuration.c  sf_estim.h   transform.c
    band_nrg.h      fft.h           psy_configuration.h  spreading.c  transform.h
    bit_cnt.c       grp_data.c      psy_const.h          spreading.h
    bit_cnt.h       grp_data.h      psy_data.h           stat_bits.c

    $ ls base/media/libstagefright/codecs/aacenc/src/
    aac_rom.c      bitbuffer.c     line_pe.c            quantize.c
    aacenc.c       bitenc.c        memalign.c           sf_estim.c
    aacenc_core.c  block_switch.c  ms_stereo.c          spreading.c
    adj_thr.c      channel_map.c   pre_echo_control.c   stat_bits.c
    asm/           dyn_bits.c      psy_configuration.c  tns.c
    band_nrg.c     grp_data.c      psy_main.c           transform.c
    bit_cnt.c      interface.c     qc_main.c
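As you can see, almost all the files in stagefright's aacenc are named identically to files in the 3GPP 26.411 v9.0.0 source: 23 of the 27 entries in the stagefright listing have a same-named counterpart (only bitbuffer.c, memalign.c, aacenc_core.c, and asm/ do not). Still, both are fixed-point AAC encoders, so similar names are to be expected. Let's use Warren Toomey's Ctcompare tool to compare the contents.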
    ./ctcompare -r 3gpp.ctf stagefright.ctf
    5473 libstagefright/codecs/aacenc/src/aac_rom.c:1507-2262 ETSI_aacPlusenc/etsiop_fastaacenc/src/aac_rom.c:701-1459
    2270 libstagefright/codecs/aacenc/src/aac_rom.c:1044-1338 ETSI_aacPlusenc/etsiop_fastaacenc/src/aac_rom.c:293-587
    523 libstagefright/codecs/aacenc/basic_op/oper_32b.c:270-345 ETSI_aacPlusenc/etsiop_ffrlib/src/transcendent_enc.c:13-85
    523 libstagefright/codecs/aacenc/basic_op/oper_32b.c:270-345 ETSI_aacPlusdec/etsiop_ffrlib/src/transcendent_enc.c:13-85
    491 libstagefright/codecs/aacenc/src/bitenc.c:398-540 ETSI_aacPlusenc/etsiop_fastaacenc/src/bitenc.c:407-553
    279 libstagefright/codecs/aacenc/src/aac_rom.c:1368-1403 ETSI_aacPlusenc/etsiop_fastaacenc/src/aac_rom.c:593-628
    243 libstagefright/codecs/aacenc/inc/aac_rom.h:65-95 ETSI_aacPlusenc/etsiop_fastaacenc/src/aac_rom.h:38-69
    218 libstagefright/codecs/aacenc/inc/bit_cnt.h:28-106 ETSI_aacPlusenc/etsiop_fastaacenc/src/bit_cnt.h:9-87
    210 libstagefright/codecs/aacenc/src/aac_rom.c:2260-2347 ETSI_aacPlusenc/etsiop_fastaacenc/src/aac_rom.c:1572-1659
    199 libstagefright/codecs/aacenc/basic_op/oper_32b.c:42-179 ETSI_aacPlusenc/etsioplib/oper_32b.c:45-182
    199 libstagefright/codecs/aacenc/basic_op/oper_32b.c:42-179 ETSI_aacPlusdec/etsioplib/oper_32b.c:45-182
    196 libstagefright/codecs/aacenc/src/sf_estim.c:831-881 ETSI_aacPlusenc/etsiop_fastaacenc/src/sf_estim.c:768-809
    194 libstagefright/codecs/aacenc/inc/tns.h:31-108 ETSI_aacPlusenc/etsiop_fastaacenc/src/tns.h:12-89
    190 libstagefright/codecs/aacenc/src/psy_main.c:424-451 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_main.c:387-414
    165 libstagefright/codecs/aacenc/inc/qc_data.h:23-92 ETSI_aacPlusenc/etsiop_fastaacenc/src/qc_data.h:4-73
    157 libstagefright/codecs/aacenc/inc/tns_func.h:31-75 ETSI_aacPlusenc/etsiop_fastaacenc/src/tns_func.h:10-54
    157 libstagefright/codecs/aacenc/src/tns.c:64-96 ETSI_aacPlusenc/etsiop_fastaacenc/src/tns.c:34-65
    156 libstagefright/codecs/aacenc/inc/dyn_bits.h:23-80 ETSI_aacPlusenc/etsiop_fastaacenc/src/dyn_bits.h:4-61
    147 libstagefright/codecs/aacenc/src/dyn_bits.c:305-351 ETSI_aacPlusenc/etsiop_fastaacenc/src/dyn_bits.c:319-363
    145 libstagefright/codecs/aacenc/src/psy_main.c:607-656 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_main.c:559-600
    139 libstagefright/codecs/aacenc/src/psy_main.c:397-414 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_main.c:359-376
    131 libstagefright/codecs/aacenc/src/psy_main.c:381-397 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_main.c:342-358
    130 libstagefright/codecs/aacenc/src/aac_rom.c:1393-1401 ETSI_aacPlusdec/etsiop_aacdec/src/aac_rom.c:928-936
    128 libstagefright/codecs/aacenc/src/block_switch.c:81-110 ETSI_aacPlusenc/etsiop_fastaacenc/src/block_switch.c:53-75
    126 libstagefright/codecs/aacenc/src/psy_main.c:42-78 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_main.c:24-60
    123 libstagefright/codecs/aacenc/src/stat_bits.c:28-55 ETSI_aacPlusenc/etsiop_fastaacenc/src/stat_bits.c:10-32
    122 libstagefright/codecs/aacenc/inc/psy_const.h:28-76 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_const.h:9-55
    119 libstagefright/codecs/aacenc/src/bitenc.c:639-664 ETSI_aacPlusenc/etsiop_fastaacenc/src/bitenc.c:632-657
    119 libstagefright/codecs/aacenc/src/bitenc.c:337-396 ETSI_aacPlusenc/etsiop_fastaacenc/src/bitenc.c:339-403
    118 libstagefright/codecs/aacenc/src/block_switch.c:33-77 ETSI_aacPlusenc/etsiop_fastaacenc/src/block_switch.c:14-49
    113 libstagefright/codecs/aacenc/inc/psy_data.h:23-66 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_data.h:4-47
    110 libstagefright/codecs/aacenc/src/bitenc.c:620-639 ETSI_aacPlusenc/etsiop_fastaacenc/src/bitenc.c:613-632
    109 libstagefright/codecs/aacenc/src/psy_main.c:382-393 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_main.c:361-372
    109 libstagefright/codecs/aacenc/src/psy_main.c:399-410 ETSI_aacPlusenc/etsiop_fastaacenc/src/psy_main.c:343-354
    108 libstagefright/codecs/aacenc/inc/qc_data.h:93-136 ETSI_aacPlusenc/etsiop_fastaacenc/src/qc_data.h:73-116
    105 libstagefright/codecs/aacenc/basic_op/typedef.h:23-63 ETSI_aacPlusdec/etsioplib/typedef.h:15-55
    105 libstagefright/codecs/aacenc/basic_op/typedef.h:23-63 ETSI_aacPlusenc/etsioplib/typedef.h:15-55
    104 libstagefright/codecs/aacenc/inc/adj_thr_data.h:30-69 ETSI_aacPlusenc/etsiop_fastaacenc/src/adj_thr_data.h:13-52
    103 libstagefright/codecs/aacenc/src/aac_rom.c:1489-1491 ETSI_aacPlusdec/etsiop_aacdec/src/aac_rom.c:72-79
    102 libstagefright/codecs/aacenc/inc/block_switch.h:30-62 ETSI_aacPlusenc/etsiop_fastaacenc/src/block_switch.h:12-44
    ...
The top four matches, in aac_rom.c and oper_32b.c, are all data tables. Since both are fixed-point AAC encoders, much of the data may be required by both, but I'd expect to see some tables wind up in a different order or slightly modified to be more useful to a particular implementation. Instead, the files have large sections of identical data and identical comments. In fact, the first match in both files opens with:
/* these tables are used only for counting and are stored in packed format */
This is followed by a series of tables with identical names and formatting down to the spaces. Ignoring these tables, bitenc.c, tns*, psy_main.c, and sf_estim.c all have huge portions of identical code and comments.
    108 ../base/media/libstagefright/codecs/aacenc/inc/qc_data.h:93-136 ../26411-900/26411-900-ANSI-C_source_code/3GPP_enhanced_aacPlus_etsiopsrc_200907/ETSI_aacPlusenc/etsiop_fastaacenc/src/qc_data.h:73-116
      Word16 staticBitsUsed; /* for verification purposes */
      Word16 dynBitsUsed; /* for verification purposes */
      Word16 pe;
      Word16 ancBitsUsed;
      Word16 fillBits;
    } QC_OUT_ELEMENT;

    typedef struct {
      QC_OUT_CHANNEL qcChannel[MAX_CHANNELS];
      QC_OUT_ELEMENT qcElement;
      Word16 totStaticBitsUsed; /* for verification purposes */
      Word16 totDynBitsUsed; /* for verification purposes */
      Word16 totAncBitsUsed; /* for verification purposes */
      Word16 totFillBits;
      Word16 alignBits;
      Word16 bitResTot;
      Word16 averageBitsTot;
    } QC_OUT;

    typedef struct {
      Word32 chBitrate;
      Word16 averageBits; /* brutto -> look ancillary.h */
      Word16 maxBits;
      Word16 bitResLevel;
      Word16 maxBitResBits;
      Word16 relativeBits; /* Bits relative to total Bits scaled down by 2 */
    } ELEMENT_BITS;

    typedef struct {
      /* this is basically struct QC_INIT */
      Word16 averageBitsTot;
      Word16 maxBitsTot;
      Word16 globStatBits;
      Word16 nChannels;
      Word16 bitResTot;
      Word16 maxBitFac;
      PADDING padding;
      ELEMENT_BITS elementBits;
      ADJ_THR_STATE adjThr;
    =====================================
      Word16 staticBitsUsed; /* for verification purposes */
      Word16 dynBitsUsed; /* for verification purposes */
      Word16 pe;
      Word16 ancBitsUsed;
      Word16 fillBits;
    } QC_OUT_ELEMENT;

    typedef struct {
      QC_OUT_CHANNEL qcChannel[MAX_CHANNELS];
      QC_OUT_ELEMENT qcElement;
      Word16 totStaticBitsUsed; /* for verification purposes */
      Word16 totDynBitsUsed; /* for verification purposes */
      Word16 totAncBitsUsed; /* for verification purposes */
      Word16 totFillBits;
      Word16 alignBits;
      Word16 bitResTot;
      Word16 averageBitsTot;
    } QC_OUT;

    typedef struct {
      Word32 chBitrate;
      Word16 averageBits; /* brutto -> look ancillary.h */
      Word16 maxBits;
      Word16 bitResLevel;
      Word16 maxBitResBits;
      Word16 relativeBits; /* Bits relative to total Bits scaled down by 2 */
    } ELEMENT_BITS;

    typedef struct {
      /* this is basically struct QC_INIT */
      Word16 averageBitsTot;
      Word16 maxBitsTot;
      Word16 globStatBits;
      Word16 nChannels;
      Word16 bitResTot;
      Word16 maxBitFac;
      PADDING padding;
      ELEMENT_BITS elementBits;
      ADJ_THR_STATE adjThr;
Right there we can see large runs of identical declarations, down to the typedef'd names and comments. As for actual code, and not just declarations, consider these three functions from stagefright's tns.c:
    /**
     *
     * function name: TnsDetect
     * description: Calculate TNS filter and decide on TNS usage
     * returns: 0 if success
     *
     */
    Word32 TnsDetect(TNS_DATA* tnsData,        /*!< tns data structure (modified) */
                     TNS_CONFIG tC,            /*!< tns config structure */
                     Word32* pScratchTns,      /*!< pointer to scratch space */
                     const Word16 sfbOffset[], /*!< scalefactor size and table */
                     Word32* spectrum,         /*!< spectral data */
                     Word16 subBlockNumber,    /*!< subblock num */
                     Word16 blockType,         /*!< blocktype (long or short) */
                     Word32 * sfbEnergy)       /*!< sfb-wise energy */
    {
      Word32 predictionGain;
      Word32 temp;
      Word32* pWork32 = &pScratchTns[subBlockNumber >> 8];
      Word16* pWeightedSpectrum = (Word16 *)&pScratchTns[subBlockNumber >> 8];

      if (tC.tnsActive) {
        CalcWeightedSpectrum(spectrum,
                             pWeightedSpectrum,
                             sfbEnergy,
                             sfbOffset,
                             tC.lpcStartLine,
                             tC.lpcStopLine,
                             tC.lpcStartBand,
                             tC.lpcStopBand,
                             pWork32);

        temp = blockType - SHORT_WINDOW;
        if ( temp != 0 ) {
          predictionGain = CalcTnsFilter( &pWeightedSpectrum[tC.lpcStartLine],
                                          tC.acfWindow,
                                          tC.lpcStopLine - tC.lpcStartLine,
                                          tC.maxOrder,
                                          tnsData->dataRaw.tnsLong.subBlockInfo.parcor);

          temp = predictionGain - tC.threshold;
          if ( temp > 0 ) {
            tnsData->dataRaw.tnsLong.subBlockInfo.tnsActive = 1;
          }
          else {
            tnsData->dataRaw.tnsLong.subBlockInfo.tnsActive = 0;
          }

          tnsData->dataRaw.tnsLong.subBlockInfo.predictionGain = predictionGain;
        }
        else{
          predictionGain = CalcTnsFilter( &pWeightedSpectrum[tC.lpcStartLine],
                                          tC.acfWindow,
                                          tC.lpcStopLine - tC.lpcStartLine,
                                          tC.maxOrder,
                                          tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].parcor);

          temp = predictionGain - tC.threshold;
          if ( temp > 0 ) {
            tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].tnsActive = 1;
          }
          else {
            tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].tnsActive = 0;
          }

          tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].predictionGain = predictionGain;
        }
      }
      else{
        temp = blockType - SHORT_WINDOW;
        if ( temp != 0 ) {
          tnsData->dataRaw.tnsLong.subBlockInfo.tnsActive = 0;
          tnsData->dataRaw.tnsLong.subBlockInfo.predictionGain = 0;
        }
        else {
          tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].tnsActive = 0;
          tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].predictionGain = 0;
        }
      }

      return(0);
    }

    /*****************************************************************************
    *
    * function name: TnsSync
    * description: update tns parameter
    *
    *****************************************************************************/
    void TnsSync(TNS_DATA *tnsDataDest,
                 const TNS_DATA *tnsDataSrc,
                 const TNS_CONFIG tC,
                 const Word16 subBlockNumber,
                 const Word16 blockType)
    {
      TNS_SUBBLOCK_INFO *sbInfoDest;
      const TNS_SUBBLOCK_INFO *sbInfoSrc;
      Word32 i, temp;

      temp = blockType - SHORT_WINDOW;
      if ( temp != 0 ) {
        sbInfoDest = &tnsDataDest->dataRaw.tnsLong.subBlockInfo;
        sbInfoSrc = &tnsDataSrc->dataRaw.tnsLong.subBlockInfo;
      }
      else {
        sbInfoDest = &tnsDataDest->dataRaw.tnsShort.subBlockInfo[subBlockNumber];
        sbInfoSrc = &tnsDataSrc->dataRaw.tnsShort.subBlockInfo[subBlockNumber];
      }

      if (100*abs_s(sbInfoDest->predictionGain - sbInfoSrc->predictionGain) <
          (3 * sbInfoDest->predictionGain)) {
        sbInfoDest->tnsActive = sbInfoSrc->tnsActive;
        for ( i=0; i< tC.maxOrder; i++) {
          sbInfoDest->parcor[i] = sbInfoSrc->parcor[i];
        }
      }
    }

    /*****************************************************************************
    *
    * function name: TnsEncode
    * description: do TNS filtering
    * returns: 0 if success
    *
    *****************************************************************************/
    Word16 TnsEncode(TNS_INFO* tnsInfo,     /*!< tns info structure (modified) */
                     TNS_DATA* tnsData,     /*!< tns data structure (modified) */
                     Word16 numOfSfb,       /*!< number of scale factor bands */
                     TNS_CONFIG tC,         /*!< tns config structure */
                     Word16 lowPassLine,    /*!< lowpass line */
                     Word32* spectrum,      /*!< spectral data (modified) */
                     Word16 subBlockNumber, /*!< subblock num */
                     Word16 blockType)      /*!< blocktype (long or short) */
    {
      Word32 i;
      Word32 temp_s;
      Word32 temp;
      TNS_SUBBLOCK_INFO *psubBlockInfo;

      temp_s = blockType - SHORT_WINDOW;
      if ( temp_s != 0) {
        psubBlockInfo = &tnsData->dataRaw.tnsLong.subBlockInfo;
        if (psubBlockInfo->tnsActive == 0) {
          tnsInfo->tnsActive[subBlockNumber] = 0;
          return(0);
        }
        else {
          Parcor2Index(psubBlockInfo->parcor,
                       tnsInfo->coef,
                       tC.maxOrder,
                       tC.coefRes);

          Index2Parcor(tnsInfo->coef,
                       psubBlockInfo->parcor,
                       tC.maxOrder,
                       tC.coefRes);

          for (i=tC.maxOrder - 1; i>=0; i--) {
            temp = psubBlockInfo->parcor[i] - TNS_PARCOR_THRESH;
            if ( temp > 0 )
              break;
            temp = psubBlockInfo->parcor[i] + TNS_PARCOR_THRESH;
            if ( temp < 0 )
              break;
          }
          tnsInfo->order[subBlockNumber] = i + 1;

          tnsInfo->tnsActive[subBlockNumber] = 1;
          for (i=subBlockNumber+1; i [...] tnsActive[i] = 0;
          }
          tnsInfo->coefRes[subBlockNumber] = tC.coefRes;
          tnsInfo->length[subBlockNumber] = numOfSfb - tC.tnsStartBand;

          AnalysisFilterLattice(&(spectrum[tC.tnsStartLine]),
                                (min(tC.tnsStopLine,lowPassLine) - tC.tnsStartLine),
                                psubBlockInfo->parcor,
                                tnsInfo->order[subBlockNumber],
                                &(spectrum[tC.tnsStartLine]));
        }
      } /* if (blockType!=SHORT_WINDOW) */
      else /*short block*/ {
        psubBlockInfo = &tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber];
        if (psubBlockInfo->tnsActive == 0) {
          tnsInfo->tnsActive[subBlockNumber] = 0;
          return(0);
        }
        else {
          Parcor2Index(psubBlockInfo->parcor,
                       &tnsInfo->coef[subBlockNumber*TNS_MAX_ORDER_SHORT],
                       tC.maxOrder,
                       tC.coefRes);

          Index2Parcor(&tnsInfo->coef[subBlockNumber*TNS_MAX_ORDER_SHORT],
                       psubBlockInfo->parcor,
                       tC.maxOrder,
                       tC.coefRes);

          for (i=(tC.maxOrder - 1); i>=0; i--) {
            temp = psubBlockInfo->parcor[i] - TNS_PARCOR_THRESH;
            if ( temp > 0 )
              break;
            temp = psubBlockInfo->parcor[i] + TNS_PARCOR_THRESH;
            if ( temp < 0 )
              break;
          }
          tnsInfo->order[subBlockNumber] = i + 1;

          tnsInfo->tnsActive[subBlockNumber] = 1;
          tnsInfo->coefRes[subBlockNumber] = tC.coefRes;
          tnsInfo->length[subBlockNumber] = numOfSfb - tC.tnsStartBand;

          AnalysisFilterLattice(&(spectrum[tC.tnsStartLine]),
                                (tC.tnsStopLine - tC.tnsStartLine),
                                psubBlockInfo->parcor,
                                tnsInfo->order[subBlockNumber],
                                &(spectrum[tC.tnsStartLine]));
        }
      }

      return(0);
    }
Here are the same three functions in the same order with nearly identical implementation in the 3GPP code:
    /*!
      \brief Calculate TNS filter and decide on TNS usage
      \return zero
    */
    Word32 TnsDetect(TNS_DATA* tnsData,        /*!< tns data structure (modified) */
                     TNS_CONFIG tC,            /*!< tns config structure */
                     Word32* pScratchTns,      /*!< pointer to scratch space */
                     const Word16 sfbOffset[], /*!< scalefactor size and table */
                     Word32* spectrum,         /*!< spectral data */
                     Word16 subBlockNumber,    /*!< subblock num */
                     Word16 blockType,         /*!< blocktype (long or short) */
                     Word32 * sfbEnergy)       /*!< sfb-wise energy */
    {
      Word16 predictionGain;
      Word16 temp;
      Word32* pWork32 = &pScratchTns[mult(subBlockNumber,FRAME_LEN_SHORT)];
      Word16* pWeightedSpectrum = (Word16 *)&pScratchTns[mult(subBlockNumber,FRAME_LEN_SHORT)];

      test();
      if (tC.tnsActive) {
        CalcWeightedSpectrum(spectrum,
                             pWeightedSpectrum,
                             sfbEnergy,
                             sfbOffset,
                             tC.lpcStartLine,
                             tC.lpcStopLine,
                             tC.lpcStartBand,
                             tC.lpcStopBand,
                             pWork32);

        temp = sub( blockType, SHORT_WINDOW );
        test();
        if ( temp != 0 ) {
          predictionGain = CalcTnsFilter( &pWeightedSpectrum[tC.lpcStartLine],
                                          tC.acfWindow,
                                          sub(tC.lpcStopLine,tC.lpcStartLine),
                                          tC.maxOrder,
                                          tnsData->dataRaw.tnsLong.subBlockInfo.parcor);

          temp = sub( predictionGain, tC.threshold );
          test();
          if ( temp > 0 ) {
            tnsData->dataRaw.tnsLong.subBlockInfo.tnsActive = 1; move16();
          }
          else {
            tnsData->dataRaw.tnsLong.subBlockInfo.tnsActive = 0; move16();
          }

          tnsData->dataRaw.tnsLong.subBlockInfo.predictionGain = predictionGain; move16();
        }
        else{
          predictionGain = CalcTnsFilter( &pWeightedSpectrum[tC.lpcStartLine],
                                          tC.acfWindow,
                                          sub(tC.lpcStopLine, tC.lpcStartLine),
                                          tC.maxOrder,
                                          tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].parcor);

          temp = sub( predictionGain, tC.threshold );
          test();
          if ( temp > 0 ) {
            tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].tnsActive = 1; move16();
          }
          else {
            tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].tnsActive = 0; move16();
          }

          tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].predictionGain = predictionGain; move16();
        }
      }
      else{
        temp = sub( blockType, SHORT_WINDOW );
        test();
        if ( temp != 0 ) {
          tnsData->dataRaw.tnsLong.subBlockInfo.tnsActive = 0; move16();
          tnsData->dataRaw.tnsLong.subBlockInfo.predictionGain = 0; move16();
        }
        else {
          tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].tnsActive = 0; move16();
          tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].predictionGain = 0; move16();
        }
      }

      return(0);
    }

    /*****************************************************************************
        functionname: TnsSync
        description:
    *****************************************************************************/
    void TnsSync(TNS_DATA *tnsDataDest,
                 const TNS_DATA *tnsDataSrc,
                 const TNS_CONFIG tC,
                 const Word16 subBlockNumber,
                 const Word16 blockType)
    {
      TNS_SUBBLOCK_INFO *sbInfoDest;
      const TNS_SUBBLOCK_INFO *sbInfoSrc;
      Word16 i, temp;

      temp = sub( blockType, SHORT_WINDOW );
      test();
      if ( temp != 0 ) {
        sbInfoDest = &tnsDataDest->dataRaw.tnsLong.subBlockInfo; move32();
        sbInfoSrc = &tnsDataSrc->dataRaw.tnsLong.subBlockInfo; move32();
      }
      else {
        sbInfoDest = &tnsDataDest->dataRaw.tnsShort.subBlockInfo[subBlockNumber]; move32();
        sbInfoSrc = &tnsDataSrc->dataRaw.tnsShort.subBlockInfo[subBlockNumber]; move32();
      }

      if (100*abs(sbInfoDest->predictionGain - sbInfoSrc->predictionGain) <
          (3 * sbInfoDest->predictionGain)) {
        sbInfoDest->tnsActive = sbInfoSrc->tnsActive; move16();
        for ( i=0; i< tC.maxOrder; i++) {
          sbInfoDest->parcor[i] = sbInfoSrc->parcor[i]; move32();
        }
      }
    }

    /*!
      \brief do TNS filtering
      \return zero
    */
    Word16 TnsEncode(TNS_INFO* tnsInfo,     /*!< tns info structure (modified) */
                     TNS_DATA* tnsData,     /*!< tns data structure (modified) */
                     Word16 numOfSfb,       /*!< number of scale factor bands */
                     TNS_CONFIG tC,         /*!< tns config structure */
                     Word16 lowPassLine,    /*!< lowpass line */
                     Word32* spectrum,      /*!< spectral data (modified) */
                     Word16 subBlockNumber, /*!< subblock num */
                     Word16 blockType)      /*!< blocktype (long or short) */
    {
      Word16 i;
      Word16 temp_s;
      Word32 temp;

      temp_s = sub(blockType,SHORT_WINDOW);
      test();
      if ( temp_s != 0) {
        test();
        if (tnsData->dataRaw.tnsLong.subBlockInfo.tnsActive == 0) {
          tnsInfo->tnsActive[subBlockNumber] = 0; move16();
          return(0);
        }
        else {
          Parcor2Index(tnsData->dataRaw.tnsLong.subBlockInfo.parcor,
                       tnsInfo->coef,
                       tC.maxOrder,
                       tC.coefRes);

          Index2Parcor(tnsInfo->coef,
                       tnsData->dataRaw.tnsLong.subBlockInfo.parcor,
                       tC.maxOrder,
                       tC.coefRes);

          for (i=sub(tC.maxOrder,1); i>=0; i--) {
            temp = L_sub( tnsData->dataRaw.tnsLong.subBlockInfo.parcor[i], TNS_PARCOR_THRESH );
            test();
            if ( temp > 0 )
              break;
            temp = L_add( tnsData->dataRaw.tnsLong.subBlockInfo.parcor[i], TNS_PARCOR_THRESH );
            test();
            if ( temp < 0 )
              break;
          }
          tnsInfo->order[subBlockNumber] = add(i,1); move16();

          tnsInfo->tnsActive[subBlockNumber] = 1; move16();
          for (i=add(subBlockNumber,1); i [...] tnsActive[i] = 0; move16();
          }
          tnsInfo->coefRes[subBlockNumber] = tC.coefRes; move16();
          tnsInfo->length[subBlockNumber] = sub(numOfSfb,tC.tnsStartBand); move16();

          AnalysisFilterLattice(&(spectrum[tC.tnsStartLine]),
                                sub(S_min(tC.tnsStopLine,lowPassLine),tC.tnsStartLine),
                                tnsData->dataRaw.tnsLong.subBlockInfo.parcor,
                                tnsInfo->order[subBlockNumber],
                                &(spectrum[tC.tnsStartLine]));
        }
      } /* if (blockType!=SHORT_WINDOW) */
      else /*short block*/ {
        test();
        if (tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].tnsActive == 0) {
          tnsInfo->tnsActive[subBlockNumber] = 0; move16();
          return(0);
        }
        else {
          Parcor2Index(tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].parcor,
                       &tnsInfo->coef[subBlockNumber*TNS_MAX_ORDER_SHORT],
                       tC.maxOrder,
                       tC.coefRes);

          Index2Parcor(&tnsInfo->coef[subBlockNumber*TNS_MAX_ORDER_SHORT],
                       tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].parcor,
                       tC.maxOrder,
                       tC.coefRes);

          for (i=sub(tC.maxOrder,1); i>=0; i--) {
            temp = L_sub( tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].parcor[i], TNS_PARCOR_THRESH );
            test();
            if ( temp > 0 )
              break;
            temp = L_add( tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].parcor[i], TNS_PARCOR_THRESH );
            test();
            if ( temp < 0 )
              break;
          }
          tnsInfo->order[subBlockNumber] = add(i,1); move16();

          tnsInfo->tnsActive[subBlockNumber] = 1; move16();
          tnsInfo->coefRes[subBlockNumber] = tC.coefRes; move16();
          tnsInfo->length[subBlockNumber] = sub(numOfSfb, tC.tnsStartBand); move16();

          AnalysisFilterLattice(&(spectrum[tC.tnsStartLine]),
                                sub(tC.tnsStopLine,tC.tnsStartLine),
                                tnsData->dataRaw.tnsShort.subBlockInfo[subBlockNumber].parcor,
                                tnsInfo->order[subBlockNumber],
                                &(spectrum[tC.tnsStartLine]));
        }
      }

      return(0);
    }
I think at this point you'd have to be a crazy person not to see that the Stagefright AAC encoder is not an independent implementation but is derived from 26.411.
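Most of the differences between the two TNS listings are mechanical: the 3GPP code routes arithmetic through basic operators such as sub(), add(), and mult(), plus complexity-counting calls such as test() and move16(), while the stagefright code uses native C operators. The sketch below is my own illustration, not code from either tree. The sub() and mult() here are simplified stand-ins written from the usual ETSI/ITU-T basic-operator definitions (saturating 16-bit subtract, Q15 fractional multiply), and FRAME_LEN_SHORT = 128 is an assumption based on the AAC short-block length, since its value is not shown in the listings. Under those assumptions, mult(n, FRAME_LEN_SHORT) and n >> 8 agree for every non-negative 16-bit n, which would explain how mult(subBlockNumber,FRAME_LEN_SHORT) turned into subBlockNumber >> 8.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t Word16;
typedef int32_t Word32;

/* Assumption: AAC short-block length; not shown in the listings above. */
#define FRAME_LEN_SHORT 128

/* Simplified stand-in: clamp a 32-bit value into the 16-bit range. */
static Word16 saturate(Word32 v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (Word16)v;
}

/* Stand-in for the basic operator sub(): saturating 16-bit subtraction. */
static Word16 sub(Word16 a, Word16 b)
{
    return saturate((Word32)a - (Word32)b);
}

/* Stand-in for the basic operator mult(): Q15 fractional multiply,
 * (a * b) >> 15 with saturation. Only called with non-negative
 * operands here, so the right shift is well defined. */
static Word16 mult(Word16 a, Word16 b)
{
    return saturate(((Word32)a * (Word32)b) >> 15);
}

/* Count non-negative inputs where mult(n, FRAME_LEN_SHORT) != n >> 8. */
int count_mult_mismatches(void)
{
    int mismatches = 0;
    for (Word32 n = 0; n <= INT16_MAX; n++) {
        if (mult((Word16)n, FRAME_LEN_SHORT) != (Word16)(n >> 8))
            mismatches++;
    }
    return mismatches;
}

/* Count small operand pairs where sub(a, b) differs from plain a - b. */
int count_sub_mismatches(void)
{
    int mismatches = 0;
    for (Word16 a = -8; a <= 8; a++)
        for (Word16 b = -8; b <= 8; b++)
            if (sub(a, b) != a - b)
                mismatches++;
    return mismatches;
}

/* Aborts if either equivalence fails under the assumptions above. */
void self_check(void)
{
    assert(count_mult_mismatches() == 0);
    assert(count_sub_mismatches() == 0);
}
```

Calling self_check() from a main() exercises both equivalences. With these stand-in definitions, small inputs such as 0 through 7 give 0 from both mult(n, FRAME_LEN_SHORT) and n >> 8, so the rewrite preserves the reference behavior exactly rather than computing n * 128.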
Licensing
Nowhere do VisualOn and Android seem to acknowledge that their encoder is derived from the 3GPP reference encoder. The only copyright headers on the Stagefright encoder are:
    /*
     ** Copyright 2003-2010, VisualOn, Inc.
     **
     ** Licensed under the Apache License, Version 2.0 (the "License");
     ** you may not use this file except in compliance with the License.
     ** You may obtain a copy of the License at
     **
     ** http://www.apache.org/licenses/LICENSE-2.0
     **
     ** Unless required by applicable law or agreed to in writing, software
     ** distributed under the License is distributed on an "AS IS" BASIS,
     ** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     ** See the License for the specific language governing permissions and
     ** limitations under the License.
     */
The licensing of the 3GPP encoder is somewhat ambiguous on its own. There is no mention of copyright in the 3GPP bundle except for:
    Copyright Notification
    No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media
on the cover of the documentation, and this function in the decoder:
    static void display_copyright_message(void)
    {
      fprintf(stderr,"\n");
      fprintf(stderr,"*************************************************************\n");
      fprintf(stderr,"* Enhanced aacPlus 3GPP ETSI-op Reference Decoder\n");
      fprintf(stderr,"* Build %s, %s\n", __DATE__, __TIME__);
      fprintf(stderr,"*\n");
      fprintf(stderr,"*************************************************************\n\n");
    }
It seems pretty clear to me that 3GPP intends its reference code to be used, but the terms of such use are unknown. The 3GPP code was provided by a company called Coding Technologies (CT). Coding Technologies has since been acquired by Dolby and is now called Dolby International. Dolby isn't the most open-source-friendly company out there.
Some 3GPP source files resemble MPEG reference source files, which bear the notice:
    /**********************************************************************
    SC 29 Software Copyright Licencing Disclaimer:

    This software module was originally developed by [...] and edited by [...]
    in the course of development of the ISO/IEC 13818-7 and ISO/IEC 14496-3
    standards for reference purposes and its performance may not have been
    optimized. This software module is an implementation of one or more tools
    as specified by the ISO/IEC 13818-7 and ISO/IEC 14496-3 standards.
    ISO/IEC gives users free license to this software module or modifications
    thereof for use in products claiming conformance to audiovisual and
    image-coding related ITU Recommendations and/or ISO/IEC International
    Standards. ISO/IEC gives users the same free license to this software
    module or modifications thereof for research purposes and further ISO/IEC
    standardisation. Those intending to use this software module in products
    are advised that its use may infringe existing patents. ISO/IEC have no
    liability for use of this software module or modifications thereof.
    Copyright is not released for products that do not conform to audiovisual
    and image-coding related ITU Recommendations and/or ISO/IEC International
    Standards. The original developer retains full right to modify and use
    the code for its own purpose, assign or donate the code to a third party
    and to inhibit third parties from using the code for products that do not
    conform to audiovisual and image-coding related ITU Recommendations and/or
    ISO/IEC International Standards. This copyright notice must be included
    in all copies or derivative works. Copyright (c) ISO/IEC 1997.
    **********************************************************************/
"Copyright is not released for products that do not conform to audiovisual and image-coding related ITU Recommendations and/or ISO/IEC International Standards" is generally viewed as the problematic clause in this license. This clause was a problem for LAME before they rewrote the last of the dist10 reference code, and it has made FAAC undistributable. To put this code under an Apache license, you would need to track down the copyright holders named at the top of the notice and ask them to relicense. However, Dolby clearly can relicense the code it owns and the code CT owns without asking anyone.
Community
From a community standpoint, my mind was initially boggled as to why the documentation is marked Proprietary & Confidential. Line endings are mixed and there is plenty of trailing whitespace in the source. This sort of thing wouldn't fly in many large open source projects. Then I realized they simply bought an encoder from one of the many companies selling optimized multimedia reference code. It was probably cheaper than having employees write their own or port the reference code themselves. Still, the license on their documentation doesn't give me much faith in their attention to detail on such matters.
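The two source-hygiene problems mentioned above, mixed line endings and trailing whitespace, are easy to check for mechanically. Here is a minimal sketch of such a check; it is not taken from either code base, and the names HygieneReport, scan_text, and has_mixed_line_endings are made up for illustration. It only counts lines that are terminated by a newline, so a final unterminated line is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Tallies for one file's contents (hypothetical helper type). */
typedef struct {
    size_t lf_lines;    /* lines ending in "\n" only */
    size_t crlf_lines;  /* lines ending in "\r\n" */
    size_t trailing_ws; /* lines with a space or tab right before the ending */
} HygieneReport;

/* Scan a buffer once, classifying each newline-terminated line. */
HygieneReport scan_text(const char *buf, size_t len)
{
    HygieneReport r = {0, 0, 0};
    size_t line_start = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != '\n')
            continue;

        size_t end = i; /* one past the last character before the ending */
        if (end > line_start && buf[end - 1] == '\r') {
            r.crlf_lines++;
            end--; /* step over the '\r' so it is not counted as content */
        } else {
            r.lf_lines++;
        }

        if (end > line_start &&
            (buf[end - 1] == ' ' || buf[end - 1] == '\t'))
            r.trailing_ws++;

        line_start = i + 1;
    }
    return r;
}

/* Non-zero when the buffer contains both LF and CRLF line endings. */
int has_mixed_line_endings(const HygieneReport *r)
{
    return r->lf_lines > 0 && r->crlf_lines > 0;
}
```

To use it, read a source file into memory, pass the buffer and its length to scan_text, and report any file where has_mixed_line_endings is true or trailing_ws is non-zero.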
Maybe a community effort can build on top of this, like opencore-amr did with the AMR code from earlier Android releases, and we can finally have a decent OSS AAC encoder, but I'm not going to hold my breath. Punting code over a wall usually isn't a good strategy for building community.