tough C++ problem (lapack maybe)


  • tough C++ problem (lapack maybe)

    Hello,

    I have been wrestling with a C++ program written by others... I've managed to pinpoint the issue to one method that doesn't seem to exhibit a deterministic behaviour.

    The method simply solves a linear system (796 rows, 8 columns) using calls to lapack (dgels). To verify that the input is the same, I exported the incoming matrices to binary files. For each run of the program, these files are identical.
    The inconsistency occurs after the second call to lapack, so to verify this, I copied the code to a simple standalone program, where I perform the exact same calculations (I just read in the binary files to get the data; the active code is copy/paste). And this standalone program is... deterministic!
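The export-and-reload step described above could look roughly like the sketch below. The helper names `dump_doubles` and `load_doubles` are hypothetical, not taken from the original program, and the sketch assumes the matrix is stored as a flat array of `double` that is written and read on the same machine (raw bytes are not portable across platforms).

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: write the raw bytes of `values` to `path`.
// Returns false if the file could not be opened or written.
bool dump_doubles(const std::string& path, const std::vector<double>& values) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(double)));
    return static_cast<bool>(out);
}

// Hypothetical helper: read exactly `count` doubles back from `path`.
// Returns an empty vector if the file is missing or too short.
std::vector<double> load_doubles(const std::string& path, std::size_t count) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    std::vector<double> values(count);
    const std::streamsize wanted =
        static_cast<std::streamsize>(count * sizeof(double));
    in.read(reinterpret_cast<char*>(values.data()), wanted);
    if (in.gcount() != wanted) return {};
    return values;
}
```

Because the bytes are copied unchanged, a reloaded matrix is bit-identical to the one that was dumped, which is what makes the files usable as a fixed input for both programs.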

    So I went back to the original program, read in the same files, and again it is not deterministic.
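One way to pin down where two runs start to differ is a bit-for-bit comparison of the output buffers. The sketch below is only an illustration; `first_bitwise_mismatch` is a hypothetical name, and the outputs are assumed to be available as vectors of `double`.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: index of the first element whose bit pattern differs
// between `a` and `b`, or -1 if both buffers are bit-identical.
// If one buffer is a strict prefix of the other, the shorter length is returned.
std::ptrdiff_t first_bitwise_mismatch(const std::vector<double>& a,
                                      const std::vector<double>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        // memcmp compares raw bytes, so 0.0 and -0.0 count as different
        // and two NaNs with the same payload count as equal.
        if (std::memcmp(&a[i], &b[i], sizeof(double)) != 0) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    if (a.size() != b.size()) return static_cast<std::ptrdiff_t>(n);
    return -1;
}
```

Comparing bytes rather than using `==` is a deliberate choice here: the question is whether the runs are reproducible, not whether the results are numerically close.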

    The only difference is that the original program is multithreaded: one thread for the gui, one for the computations. Lapack isn't thread-safe, but there are mutex locks around all the lapack calls, and at this point in the program there are no two concurrent threads using lapack.


    I have no clue where to start searching for this... Any thoughts on what could cause this erratic behaviour?


    Jörg
    pixar
    Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)

  • #2
    N.B. My knowledge of C++ is very limited.

    1, Would it be possible to create a new instance of lapack.dgels for each call?


    2, Does the call to dgels return a value or simply pitch the request over the fence?
    If no return is expected then the mutex locks may not be doing anything, because the mutex might unlock before dgels really finishes.

    We have a similar problem here when programmers code things that should be object functions as event handlers. Change the speed of the computer it's running on and suddenly the code no longer works.

    N.B. My knowledge of C++ is very limited. (take 2)

    chuck
    Chuck
    秋音的爸爸



    • #3
      Could you elaborate on the first point?
      (In the code it is simply a function call to dgels_.)

      The function outputs the result via 2 parameters. But at this point, only one thread is using lapack...

      Thanks!

      Jorg
      pixar
      Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)



      • #4
        The first one asks, really, if the dgels you are using is an object with methods and such or a simple function.
        The ones I saw were object implementations where you might need to call a method to initialize the object before each call. Otherwise there might be leftover junk in there after the first call.
        Or alternately create, call, destroy each time it is used.


        The second asks if it is called

        1, called like a function
        mutex.lock
        ret = lapack.dgels(parameters) -- in this case the caller must wait for dgels to finish
        mutex.unlock

        2, called like a procedure
        mutex.lock
        lapack.dgels(parameters) -- in this case the caller might not wait
        mutex.unlock

        One of the ones I saw was like #2. Pitch the call over the fence. Not great if you are multi threaded.

        See, in multithreaded code you can get into a problem where the caller moves on, and when it gets around to looking at the out parameters, how does it know whether the called function has finished, since it ran in a different thread?

        PS
        Lapack isn't thread-safe, but there are mutex locks around all the lapack calls
        I'm not sure what good the mutex locks do if they are set and released in the caller. Can it know the status of the called function that is in a different thread? If so, then why is the lock necessary in the first place?
        Last edited by cjolley; 4 April 2008, 15:11.
        Chuck
        秋音的爸爸



        • #5
          There is no lapack object. The code looks like this:

          mutex.lock()
          dgels_ (parameters)
          mutex.unlock()

          The result is contained in one of the parameters (pitched over the fence).
          But IIRC, the methods ending in underscore are macros; perhaps I should look into working with such a lapack object. Do you happen to have references on that?
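As a sketch of the lock / call / unlock pattern above: in C and C++ an ordinary function call returns only after the function body has finished, whether or not a return value is assigned, so the out parameters are complete by the time the lock is released. The example below uses `solve_stub` as a hypothetical stand-in for `dgels_` (it is not the real routine and does no real solving), and `g_solver_mutex` and `locked_solve` are likewise made-up names.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for dgels_: it returns nothing and reports its
// result through the out parameters `solution` and `info`.
void solve_stub(const std::vector<double>& rhs,
                std::vector<double>& solution,
                int& info) {
    solution.assign(rhs.begin(), rhs.end());
    for (double& x : solution) x *= 0.5;  // placeholder work, not a real solve
    info = 0;                             // 0 is used here to mean success
}

// One lock shared by every caller of the solver (assumed name).
std::mutex g_solver_mutex;

// Runs the stand-in solver under the lock and returns its status code.
int locked_solve(const std::vector<double>& rhs, std::vector<double>& solution) {
    int info = -1;
    std::lock_guard<std::mutex> guard(g_solver_mutex);
    // The call below is synchronous: control comes back only after
    // `solution` and `info` have been written.
    solve_stub(rhs, solution, info);
    return info;
}  // `guard` unlocks the mutex here, after the call has completed
```

Using `std::lock_guard` instead of explicit `lock()` and `unlock()` calls also releases the mutex if the guarded code throws or returns early.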

          On the lapack forum, they suggested moving to the latest version (3.1.1; we are now using 3.0) and changing the underlying blas to the reference blas.
          So I'll give those things a go too. At work, we already considered using OpenCV for solving the linear system.

          Jorg
          pixar
          Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)



          • #6
            I'm now in the process of rebuilding CBLAS, ATLAS and CLAPACK... Hopefully using the latest versions for them will sort things out...
            pixar
            Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)



            • #7
              The object version I found was java.
              Oh well.
              Does the version you have return a value?
              As in an actual return value, not one written into the in/out parameters?
              If so then make sure it is being assigned (e.g. "x=dgels(p)").
              That way the caller would be forced to wait for it to complete.

              If not, maybe there is a pure function version, as opposed to a macro, that you can use.
              Chuck
              秋音的爸爸



              • #8
                I don't think it has a return value...

                I'm now trying to get the new version to work, but I'm having difficulty compiling it. Basically, I need to compile BLAS, CBLAS, ATLAS and CLAPACK. I may have tried this the wrong way, so now I'm trying again... Are there any good references on how to combine these libraries? (The readme files are not very clear on the subject...)


                Jörg
                pixar
                Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)



                • #9
                  Looks like the updated versions of ATLAS (with reference BLAS) and CLAPACK solve the issue!

                  The program still doesn't run as expected, but at least the output from dgels_ is consistent over different runs => the problem must lie elsewhere.


                  Jörg
                  pixar
                  Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)



                  • #10
                    Originally posted by VJ
                    The program still doesn't run as expected...
                    Meaning you don't get the result you expect from tests?
                    Chuck
                    秋音的爸爸



                    • #11
                      The lapack test runs fine (as it always did).
                      The calls to lapack in our software always return the same output for the same input (this is new).
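A lightweight way to keep checking the "same output for the same input" property across runs is to log a fingerprint of each output buffer and compare the logs. The sketch below uses a 64-bit FNV-1a hash over the raw bytes; this is one possible choice rather than anything used in the original software, and `fingerprint` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: 64-bit FNV-1a hash over the raw bytes of `values`.
// Two buffers with the same bit patterns always give the same fingerprint.
std::uint64_t fingerprint(const std::vector<double>& values) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (double v : values) {
        unsigned char bytes[sizeof(double)];
        std::memcpy(bytes, &v, sizeof(double));    // view the double as bytes
        for (unsigned char byte : bytes) {
            hash ^= byte;
            hash *= 1099511628211ULL;              // FNV-1a 64-bit prime
        }
    }
    return hash;
}
```

Printing the fingerprint after each solver call gives one number per call that can be diffed between runs. A matching fingerprint is strong evidence that the outputs agree, though as with any hash it is not a proof.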

                      Our software still shows odd behaviour, but this odd behaviour is not caused by inappropriate lapack results.

                      So I have advanced a step...


                      Jörg
                      pixar
                      Dream as if you'll live forever. Live as if you'll die tomorrow. (James Dean)

