View Issue Details

ID: 0000564
Project: ascend
Category: compiler
View Status: public
Last Update: 2015-03-29 01:49
Reporter: john
Assigned To: john
Priority: high
Severity: minor
Reproducibility: have not tried
Status: resolved
Resolution: fixed
Platform: Linux
OS: Ubuntu
OS Version: 12.04 LTS
Product Version: 0.9.8
Target Version: 0.9.9
Fixed in Version: 0.9.9
Summary: 0000564: crash in 'SetRealAtomValue' on ubuntu 12.04 64-bit
Description: On Ubuntu 12.04 64-bit, I get a crash when I try to run the solver on the models/johnpye/combinedcycle_fprops.a4c model in trunk.

Running 'pygtk/ascdev' as detailed in [[Building ASCEND]], I then just load the model and click solve; then I get a crash.
Additional Information: Here is what I get running through GDB:

ascxx/simulation.cpp:583 (build): System built OK
PYTHON: SOLVER is now QRSlv
CREATED SOLVERPARAMETERS
PYTHON: SET convopt to <Swig Object of type 'Value *' at 0x2cbae40>
solvers/qrslv/qrslv.c:3375 (structural_analysis): In QRSlv, got vused = 219...

Program received signal SIGSEGV, Segmentation fault.
CalcResidGivenValue (mode=0x7fffffffb54c, m=0x7fffffffb550, varnum=0x7fffffffb554, val=0x20a72d8, u=0x0, f=0x20a72d0, g=0x20a72e8) at ascend/compiler/relation_util.c:4039
4039 SetRealAtomValue(
(gdb) where
#0 CalcResidGivenValue (mode=0x7fffffffb54c, m=0x7fffffffb550, varnum=0x7fffffffb554, val=0x20a72d8, u=0x0, f=0x20a72d0, g=0x20a72e8) at ascend/compiler/relation_util.c:4039
#1 0x00007ffff47ff7f1 in zbrent (func=0x7ffff47fee2a <CalcResidGivenValue>, lowbound=0x7fffffffb5c0, upbound=0x7fffffffb5b8, mode=0x7fffffffb54c, m=0x7fffffffb550, n=0x7fffffffb554, x=0x20a72d8, u=0x0,
    f=0x20a72d0, g=0x20a72e8, tolerance=0x7fffffffb5a8, status=0x7fffffffb6d0) at ascend/compiler/rootfind.c:86
#2 0x00007ffff47ff11a in RootFind (rel=0x2d68bf0, lower_bound=0x7fffffffb5c0, upper_bound=0x7fffffffb5b8, nominal=0x7fffffffb5b0, tolerance=0x7fffffffb5a8, varnum=2, status=0x7fffffffb6d0)
    at ascend/compiler/relation_util.c:4122
#3 0x00007ffff47fca65 in RelationFindRoots (i=0x2d7f920, lower_bound=9.9999999999999995e-07, upper_bound=10000, nominal=298, tolerance=1e-08, varnum=0x7fffffffb650, able=0x7fffffffb6d0,
    nsolns=0x7fffffffb6d8) at ascend/compiler/relation_util.c:3212
#4 0x00007ffff484e479 in relman_directly_solve_new (rel=0x2e0b740, solvefor=0x2dfe890, able=0x7fffffffb6d0, nsolns=0x7fffffffb6d8, tolerance=1e-08) at ascend/system/relman.c:1041
#5 0x00007ffff48528a4 in slv_direct_solve (server=0x2d53740, rel=0x2e0b740, var=0x2dfe890, fp=0x7ffff6c7b260, epsilon=1e-08, ignore_bounds=0, scaled=0) at ascend/system/slv_common.c:208
#6 0x00007fffe09d64c7 in qrslv_iterate (server=0x2d53740, asys=0x2dfb8a0) at solvers/qrslv/qrslv.c:3844
#7 0x00007ffff4862ae8 in slv_iterate (sys=0x2d53740) at ascend/solver/solver.c:365
#8 0x00007ffff5b54ddd in Simulation::solve (this=0x2dc0a10, solver=..., reporter=...) at ascxx/simulation.cpp:835
#9 0x00007ffff5bedce7 in _wrap_Simulation_solve () from /home/john/ascend/ascxx/_ascpy.so
#10 0x000000000049c4d8 in PyEval_EvalFrameEx ()
Tags: No tags attached.

Relationships

has duplicate: 0000566 (new) crash solving model with extpy and external relations
related to: 0000567 (resolved) error in sim_destroy in relation.c:UpdateInputArgsList

Activities

john

2012-08-22 22:18

administrator   ~0000910

This bug does *not* appear in Ubuntu 12.04 32-bit.

john

2013-03-05 16:00

administrator   ~0000962

The test case (test/test solver_qrslv.bug564) shows the bug when running via GDB on Ubuntu 12.04 LTS, but the bug does NOT appear when running via Valgrind on the same platform.

There seems to be some kind of memory corruption going on here. Since the problem doesn't show with Valgrind, it must somehow depend on the exact nature of the problem and on the memory allocation scheme/tool/library in use at the time.

john

2015-03-26 18:31

administrator   ~0000998

Update -- this bug can also now be demonstrated using a simpler test based on model file models/test/qrslv/akash_eos.a4c. We will work on setting up a suitable test case.

john

2015-03-27 17:42

administrator   ~0000999

Update. It looks as though the variable lists are messed up somewhere in this test. Variable 'f' is not incident on eq 'tsrk.eq7'!


runqrslv.c:151 (main): Solve...
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.Tc' (0xbdbdb0) in rel 'tsrk.srk_2' (0xbe15d0)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.Tr' (0xbdbdd0) in rel 'tsrk.eq1' (0xbe13c0)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.T_degC' (0xbdbd90) in rel 'tsrk.eq3' (0xbe1420)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.omega' (0xbdbe30) in rel 'tsrk.srk_3' (0xbe1600)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.alpha' (0xbdbdf0) in rel 'tsrk.eq4' (0xbe1450)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.Pc' (0xbdbd50) in rel 'tsrk.srk_1' (0xbe15a0)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.Pr' (0xbdbd70) in rel 'tsrk.eq2' (0xbe13f0)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.beta' (0xbdbe10) in rel 'tsrk.eq6' (0xbe14b0)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.q' (0xbdbe50) in rel 'tsrk.eq5' (0xbe1480)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'tsrk.A' (0xbdbd30) in rel 'tsrk.eq9' (0xbe1540)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'Tr' (0xbdbcb0) in rel 'eqTr' (0xbe1660)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'Pr' (0xbdbc70) in rel 'eqPr' (0xbe1630)
ascend/system/slv_common.c:214 (slv_direct_solve): directly solving var 'f' (0xbdbd10) in rel 'tsrk.eq7' (0xbe14e0)

john

2015-03-29 01:49

administrator   ~0001000

Fixed in changeset 2878!

The problem involves a very contorted sequence of function calls:

slv_direct_solve
relman_directly_solve_new
RelationFindRoots
RootFind
zbrent
CalcResidGivenValue

The problem was that CalcResidGivenValue didn't correctly match the signature of an ExtEvalFunc, as expected by zbrent. A cast from unsigned long to int appears to have been taking place. That didn't seem to cause problems on 32-bit machines, where 'int' and 'long' are the same size, but it did appear to cause problems on 64-bit Linux, where 'long' is twice as wide as 'int'!

Issue History

Date Modified Username Field Change
2012-08-21 19:29 john New Issue
2012-08-21 19:29 john Summary crash on ubuntu 12.04 64-bit => crash in 'SetRealAtomValue' on ubuntu 12.04 64-bit
2012-08-22 22:18 john Note Added: 0000910
2012-09-29 11:31 john Relationship added has duplicate 0000566
2012-10-02 18:50 john Relationship added related to 0000567
2013-02-26 13:39 john Target Version 1.0 => 0.9.9
2013-02-27 13:33 john Priority normal => high
2013-03-04 23:38 svn
2013-03-05 16:00 john Note Added: 0000962
2013-03-05 16:23 svn
2015-03-26 18:31 john Note Added: 0000998
2015-03-27 17:42 john Note Added: 0000999
2015-03-29 01:49 john Note Added: 0001000
2015-03-29 01:49 john Status new => resolved
2015-03-29 01:49 john Fixed in Version => 0.9.9
2015-03-29 01:49 john Resolution open => fixed
2015-03-29 01:49 john Assigned To => john