Concurrent dump?

Nov 8, 2012 at 6:18 AM

Hi,

On this page: http://libopc.codeplex.com/wikipage?title=Code%20Samples&referringTitle=Documentation

That page says the library supports concurrent writes, but the example it lists only does interleaved writes (i.e. it is not threaded and therefore not concurrent). I've tried to dump each part to a buffer concurrently, and some asserts are firing in OPC.

So does it really support concurrency?

Assertion failed: zip->io->state.buf_pos==stream->rawBuffer.state.buf_pos+stream->rawBuffer.buf_len-stream->rawBuffer.buf_ofs, file c:\tkbt\gemmellslaunchdocx\libopc\opc\zip.c, line 965

It is triggered by a call to opcContainerReadInputStream while I'm concurrently dumping the styles, numbering and document parts:

 

bool OpcPartDumper::DumpPartToBuffer(opcContainer& opcContainer, const QString& partName, QBuffer& bufferToFill)
{
   REQUIRE(bufferToFill.isOpen() == true);
   bool result = false;

   opcPart part = opcPartFind(&opcContainer, _X(partName.toUtf8().constData()), NULL, 0);
   if (part != OPC_PART_INVALID)
   {
      opcContainerInputStream* stream = opcContainerOpenInputStream(&opcContainer, part);
      if (stream != nullptr)
      {
         // Read in 32 KB chunks to reduce the overhead of resizing bufferToFill for big files.
         opc_uint8_t buf[32768];
         opc_uint32_t len = 0;
         while((len = opcContainerReadInputStream(stream, buf, sizeof(buf))) > 0) 
         {
            bufferToFill.write(reinterpret_cast<const char*>(buf), len);
         }
         result = true;
         opcContainerCloseInputStream(stream);
      }
      else
      {
         qWarning() << "DumpPartToBuffer: Couldn't open an opc reader for part " << partName;
      }            
   } 
   else
   {
      qWarning() << "DumpPartToBuffer: Couldn't find part " << partName << " in opc container.";
   }  

   return result;
}

I'm guessing that particular call can't be used concurrently...

 

Coordinator
Nov 8, 2012 at 4:48 PM

Hi,

Well, it should do interleaved reads and writes! I guess there is a little bug :-(

Here is a little sample program, a modified version of opc_extract, which dumps two streams interleaved:

#include <opc/opc.h>
#include <stdio.h>
#include <assert.h> /* needed for assert() below */
#ifdef WIN32
#include <io.h>
#include <fcntl.h>
#include <crtdbg.h>
#endif

int main( int argc, const char* argv[] )
{
#ifdef WIN32
    _CrtSetDbgFlag (_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
    _setmode( _fileno( stdout ), _O_BINARY ); // make sure LF are not translated to CR LF on windows...
#endif
    if (OPC_ERROR_NONE==opcInitLibrary() && 4==argc) {
        opcContainer *c=NULL;
        if (NULL!=(c=opcContainerOpen(_X(argv[1]), OPC_OPEN_READ_ONLY, NULL, NULL))) {
            opcPart part=OPC_PART_INVALID;
            opcContainerInputStream *stream[2]={NULL, NULL};
            for(int i=0;i<2;i++) {
                if ((part=opcPartFind(c, _X(argv[2+i]), NULL, 0))!=OPC_PART_INVALID) {
                    stream[i]=opcContainerOpenInputStream(c, part);
                }
            }
            while(stream[0]!=NULL || stream[1]!=NULL) {
                for(int i=0;i<2;i++) {
                    if (NULL!=stream[i]) {
                        opc_uint8_t buf[100];
                        opc_uint32_t len=0;
                        if((len=opcContainerReadInputStream(stream[i], buf, sizeof(buf)))>0) {
                            fwrite(buf, sizeof(opc_uint8_t), len, stdout);
                        } else if (0==len) {
                            opcContainerCloseInputStream(stream[i]);
                            stream[i]=NULL;
                        } else {
                            assert(0); // not good... (unreachable as written: len is unsigned, so it is either >0 or 0)
                        }
                    }
                }
            }
            opcContainerClose(c, OPC_CLOSE_NOW);
        } else {
            fprintf(stderr, "ERROR: file \"%s\" could not be opened.\n", argv[1]);
        }
        opcFreeLibrary();
    } else if (4==argc) { // argument count was right, so opcInitLibrary() must have failed
        fprintf(stderr, "ERROR: initialization of libopc failed.\n");    
    } else {
        fprintf(stderr, "opc_extract FILENAME PART1 PART2\n\n");
        fprintf(stderr, "Sample: opc_extract test.docx word/document.xml word/styles.xml\n");
    }
#ifdef WIN32
    OPC_ASSERT(!_CrtDumpMemoryLeaks());
#endif
    return 0;
}

I gave it a try with:

> opc_extract2 OOXMLI1.docx word/document.xml word/styles.xml

and it gave me the two streams without any assertion failure.

It would be great to understand the bug which triggers the assertion in your program. 

 

One more thing: libopc is NOT thread safe. So if you are calling the "dump" function from different threads, be sure to lock a mutex before calling into libopc.
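The mutex advice could be sketched roughly like this. This is only an illustration, not libopc code: `SharedContainer`, `g_libraryMutex`, `DumpRangeLocked` and `DumpTwoPartsFromThreads` are hypothetical names, and the struct merely stands in for a library whose streams share one I/O position. In a real program the lock would be held around the libopc calls themselves; holding it for a whole open/read/close sequence is probably the simplest choice.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a non-thread-safe library: every reader
// shares one cursor, so unsynchronized calls could corrupt it.
struct SharedContainer {
    std::string data;        // pretend this is the package contents
    std::size_t cursor = 0;  // shared state, like a single file position
};

// One process-wide lock that serializes every call into the "library".
static std::mutex g_libraryMutex;

// Copies up to `len` bytes starting at `offset` while holding the lock.
std::string DumpRangeLocked(SharedContainer& c, std::size_t offset, std::size_t len) {
    std::lock_guard<std::mutex> guard(g_libraryMutex);
    assert(offset <= c.data.size());
    c.cursor = offset;                               // "seek"
    std::string out = c.data.substr(c.cursor, len);  // "read"
    c.cursor += out.size();
    return out;
}

// Dumps two ranges from two threads; the mutex keeps the calls from overlapping.
std::vector<std::string> DumpTwoPartsFromThreads(SharedContainer& c) {
    std::vector<std::string> results(2);
    std::thread t0([&] { results[0] = DumpRangeLocked(c, 0, 5); });
    std::thread t1([&] { results[1] = DumpRangeLocked(c, 5, 5); });
    t0.join();
    t1.join();
    return results;
}
```

Note that this serializes the work: the threads take turns inside the library, so it makes the calls safe but does not make the dumps run in parallel.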

 

Hope that helps,

 

Florian

Nov 13, 2012 at 4:49 AM
flr wrote:

One more thing: libopc is NOT thread safe. So if you are calling the "dump" function from different threads, be sure to lock a mutex before calling into libopc.

Well, that'd be it then. I read "Concurrent Write" to mean "can write/read from two separate threads, so long as they're different parts, without the need for a mutex". Talk about reading into it! Maybe don't use the term "concurrent" on that page; it's not concurrent (occurring at the same time) at all.

Thanks for the clarification.