[ opensips-Bugs-2593088 ] Crash with db_postgres and presence

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[ opensips-Bugs-2593088 ] Crash with db_postgres and presence

SourceForge.net
Bugs item #2593088, was opened at 2009-02-12 15:48
Message generated for change (Comment added) made by krokodilerian
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2593088&group_id=232389

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: trunk
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Bogdan-Andrei Iancu (bogdan_iancu)
Summary: Crash with db_postgres and presence

Initial Comment:
(submitted by [hidden email])

The problem is as follows - a query is executed, but for some reason the results from the previous one are read back from the DB driver, which in turn makes opensips to crash as the number of columns is lower and it tries to dereference stuff that doesn't exist.

Appended are the log and the backtrace:


Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_submit_query: 0x8197240 PQsendQuery(select status,reason from watchers where presentity_uri='sip:bbb%40ccc.com@10.2.30.5' AND watcher_username='aaa%40ccc.com' AND watcher_domain='10.2.30.5' AND event='presence')
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_new_result: allocate 28 bytes for result set at 0x81a0948
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_store_result: 0x8197240 PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x94c0300)
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_get_columns: 1 columns returned from the query
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_allocate_columns: allocate 16 bytes for result columns at 0x819ecd0
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_get_columns: RES_NAMES(0x819ecd8)[0]=[username]
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_get_columns: use DB_STRING result type
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_convert_rows: allocate for 1 columns 4 bytes in row buffer at 0x819d5b8
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_allocate_rows: allocate 28 bytes for result rows and values at 0x819f870
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x8197240,0,0)=[[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_convert_rows: allocated 11 bytes for row_buf[0] at 0x819d5c8
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_convert_rows: [0][0] Column[username]=[[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_str2val: converting STRING [[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:db_postgres_convert_rows: freeing row buffer at 0x819d5b8
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:free_query: PQclear(0x94c0300) result set
Feb 12 13:42:59 f-ast2 opensips[29507]: DBG:db_postgres:db_postgres_store_result: 0x8197100 PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x94bb108)
Feb 12 13:42:59 f-ast2 kernel: [842163.893756] opensips[29504]: segfault at 1 ip b77b98c6 sp bff1ab30 error 4 in presence.so[b7790000+31000]



Core was generated by `opensips'.
Program terminated with signal 11, Segmentation fault.
[New process 29504]
#0  0xb77b98c6 in get_db_subs_auth (subs=0xbff1acf4, found=0xbff1ad94) at subscribe.c:1980
1980 subs->reason.len= strlen(row_vals[1].val.string_val);
(gdb) bt
#0  0xb77b98c6 in get_db_subs_auth (subs=0xbff1acf4, found=0xbff1ad94) at subscribe.c:1980
#1  0xb77ba5cc in handle_subscribe (msg=0x819acb8, str1=0x0, str2=0x0) at subscribe.c:630
#2  0x08055591 in do_action (a=0x8195940, msg=0x819acb8) at action.c:961
#3  0x080541e2 in run_action_list (a=0x8195940, msg=0x819acb8) at action.c:139
#4  0x08056ecd in do_action (a=0x8195fe0, msg=0x819acb8) at action.c:705
#5  0x080541e2 in run_action_list (a=0x8195fe0, msg=0x819acb8) at action.c:139
#6  0x08057529 in do_action (a=0x8196048, msg=0x819acb8) at action.c:711
#7  0x080541e2 in run_action_list (a=0x81953d0, msg=0x819acb8) at action.c:139
#8  0x08056914 in do_action (a=0x8192808, msg=0x819acb8) at action.c:119
#9  0x080541e2 in run_action_list (a=0x8192808, msg=0x819acb8) at action.c:139
#10 0x08056ecd in do_action (a=0x8192870, msg=0x819acb8) at action.c:705
#11 0x080541e2 in run_action_list (a=0x818d6b8, msg=0x819acb8) at action.c:139
#12 0x08057ee1 in run_top_route (a=0x818d6b8, msg=0x819acb8) at action.c:119
#13 0x0809228c in receive_msg (
    buf=0x8164100 "SUBSCRIBE sip:bbb%40ccc.com@10.2.30.5;transport=UDP SIP/2.0\r\nVia: SIP/2.0/UDP 10.2.2.179:5060;branch=z9hG4bK-d8754z-1c3f735ee1e02d36-1---d8754z-\r\nMax-Forwards: 69\r\nContact: <sip:aaa%40ccc.com@10.2.2.1"..., len=793, rcv_info=0xbff1bad4) at receive.c:165
#14 0x080cdf4b in udp_rcv_loop () at udp_server.c:449
#15 0x0806c3af in main (argc=1, argv=0xbff1bc64) at main.c:778



----------------------------------------------------------------------

Comment By: Vasil Kolev (krokodilerian)
Date: 2009-03-17 11:24

Message:
Sorry, I completely forgot about this one. It was already fixed by the
patch that NULL-terminated everything (even zero-length strings). You can
close it.

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-03-17 11:20

Message:
Vasil,

can you retest - maybe this is related to the bug we fixed yesterday in
db_postgres.

Regards,
Bogdan

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2009-02-16 18:56

Message:
Anca, that's not the problem, if you get a NULL in the column, it's still
two columns (I specifically checked the code).

Also, I've seen in the logs how one query is executed and another thread
takes the result, I'm definitely sure this is the problem (logs are at the
end of the message). The solution I proposed sucks and somewhat works, and
you're about to release 1.5, but when I have some time I have some initial
approval for writing a better DB pool infrastructure for OpenSIPS (and
contributing it), so I'll probably send a proposal to the list next week.

As for the log - PID 29507 does a query and while waiting for the result,
29504 does another query and gets the result of 29507's query, and crashes
(the result is obviously the wrong one, the column is called "username").


Feb 12 13:42:59 f-ast2 opensips[29507]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 11 chars, out:
11 chars
Feb 12 13:42:59 f-ast2 opensips[29507]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 9 chars, out:
9 chars
Feb 12 13:42:59 f-ast2 opensips[29507]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 13 chars, out:
13 chars
Feb 12 13:42:59 f-ast2 opensips[29507]:
DBG:db_postgres:db_postgres_submit_query: 0x8197100 PQsendQuery(select
username from uri where username='[hidden email]' AND domain='10.2.30.5' AND
uri_user='aaa%40ccc.
com')
Feb 12 13:42:59 f-ast2 opensips[29507]: DBG:core:db_new_result: allocate
28 bytes for result set at 0x81a1c70
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_store_result: 0x81971a8
PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x94c0300)
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: 2 columns returned from the query

Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_allocate_columns:
allocate 32 bytes for result columns at 0x81a0410
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: RES_NAMES(0x81a0420)[0]=[password]

Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: use DB_STRING result type
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: RES_NAMES(0x81a0428)[1]=[rpid]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: use DB_STRING result type
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: allocate for 2 columns 8 bytes in
row buffer at 0x819ba10
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_allocate_rows:
allocate 48 bytes for result rows and values at 0x819dfe8
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x81971a8,0,0)=[1234]

Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: allocated 4 bytes for row_buf[0]
at 0x819c5a0
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: [0][0] Column[password]=[1234]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x81971a8,0,1)=[]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: allocated 0 bytes for row_buf[1]
at 0x819c5b0
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: [0][1] Column[rpid]=[]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_str2val: converting STRING [1234]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_str2val: converting STRING []
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: freeing row buffer at 0x819ba10
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:free_query:
PQclear(0x94c0300) result set
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:auth_db:get_ha1: HA1 string
calculated: 7469b80fecb342ef7a4b4daeff01fc57
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:auth:check_response: our
result = 'fadac954de8faf7fcb91f593cab55ec6'
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:auth:check_response:
authorization is OK
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:auth:post_auth: nonce index=
46
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_free_columns: freeing
result columns at 0x81a0410
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_free_rows: freeing 1
rows
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_free_rows: freeing
rows at 0x819dfe8
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_free_result: freeing
result set at 0x81a0970
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 11 chars, out:
11 chars
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 9 chars, out:
9 chars
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 13 chars, out:
13 chars
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_submit_query: 0x8197100 PQsendQuery(select
username from uri where username='[hidden email]' AND domain='10.2.30.5' AND
uri_user='aaa%40ccc.
com')
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_new_result: allocate
28 bytes for result set at 0x81a0970
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_store_result: 0x8197100
PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x94bb108)
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: 1 columns returned from the query

Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_allocate_columns:
allocate 16 bytes for result columns at 0x819ba10
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: RES_NAMES(0x819ba18)[0]=[username]

Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: use DB_STRING result type
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: allocate for 1 columns 4 bytes in
row buffer at 0x819cfe8
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_allocate_rows:
allocate 28 bytes for result rows and values at 0x81a0410
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows:
PQgetvalue(0x8197100,0,0)=[[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: allocated 11 bytes for row_buf[0]
at 0x819c9e8
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: [0][0]
Column[username]=[[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_str2val: converting STRING [[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: freeing row buffer at 0x819cfe8
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:free_query:
PQclear(0x94bb108) result set
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:uri_db:check_username: From/To
user 'aaa%40ccc.com' and auth user match
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_free_columns: freeing
result columns at 0x819ba10
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_free_rows: freeing 1
rows
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_free_rows: freeing
rows at 0x81a0410
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_free_result: freeing
result set at 0x81a0970
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:grep_sock_info: checking
if host==us: 9==9 &&  [10.2.30.5] == [10.2.30.5]
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:grep_sock_info: checking
if port 5060 matches port 5060
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:tm:t_newtran: transaction on
entrance=(nil)
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:parse_headers:
flags=ffffffffffffffff
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:get_hdr_field:
content_length=0
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:get_hdr_field: found end
of header
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:parse_headers: flags=78
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:tm:t_lookup_request: start
searching: hash=42272, isACK=0
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:tm:matching_3261: RFC3261
transaction matching failed
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:tm:t_lookup_request: no
transaction found
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:tm:run_reqin_callbacks:
trans=0xb5899c08, callback type 1, id 1 entered
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:dialog:build_new_dlg: new
dialog 0xb58be478
(c=YjQxOTNkMmM0ZWY0NTA0ZDdkMmVjMmEyODBjN2U5MDg.,f=sip:aaa%40ccc.com@10.2.30.5;transport=UDP,t=sip:bbb%40ccc.com@10.2.30.5;transport=UDP,ft=3d0afe4a)
on hash 864
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:parse_headers: flags=20
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:parse_headers: flags=400

Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:dialog:populate_leg_info:
route_set , contact sip:aaa%40ccc.com@10.2.2.179:5060, cseq 2 and bind_addr
udp:10.2.30.5:5060
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:dialog:link_dlg: ref dlg
0xb58be478 with 3 -> 3
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:dialog:run_create_callbacks:
dialog=0xb58be478
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:dialog:get_dlg_timeout:
invalid AVP value, use default timeout
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:tm:run_reqin_callbacks:
trans=0xb5899c08, callback type 1, id 0 entered
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:parse_headers:
flags=ffffffffffffffff
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:presence:search_event: start
event= [presence]
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:presence:extract_sdialog_info:
'Expires' header found, value= 60
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:presence:extract_sdialog_info:
'To' header ALREADY PARSED: <sip:bbb%40ccc.com@10.2.30.5;transport=UDP>
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:presence:extract_sdialog_info:
initial request
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:presence:extract_sdialog_info:
subs->contact= sip:aaa%40ccc.com@10.2.2.179:5060 - len = 33
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 27 chars, out:
27 chars
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 13 chars, out:
13 chars
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 9 chars, out:
9 chars
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 8 chars, out:
8 chars
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_submit_query: 0x8197240 PQsendQuery(select
status,reason from watchers where
presentity_uri='sip:bbb%40ccc.com@10.2.30.5' AND
watcher_username='aaa%40ccc.com' AND watcher_domain='10.2.30.5' AND
event='presence')
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_new_result: allocate
28 bytes for result set at 0x81a0948
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_store_result: 0x8197240
PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x94c0300)
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: 1 columns returned from the query

Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_allocate_columns:
allocate 16 bytes for result columns at 0x819ecd0
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: RES_NAMES(0x819ecd8)[0]=[username]

Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_get_columns: use DB_STRING result type
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: allocate for 1 columns 4 bytes in
row buffer at 0x819d5b8
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:core:db_allocate_rows:
allocate 28 bytes for result rows and values at 0x819f870
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows:
PQgetvalue(0x8197240,0,0)=[[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: allocated 11 bytes for row_buf[0]
at 0x819d5c8
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: [0][0]
Column[username]=[[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_str2val: converting STRING [[hidden email]]
Feb 12 13:42:59 f-ast2 opensips[29504]:
DBG:db_postgres:db_postgres_convert_rows: freeing row buffer at 0x819d5b8
Feb 12 13:42:59 f-ast2 opensips[29504]: DBG:db_postgres:free_query:
PQclear(0x94c0300) result set
Feb 12 13:42:59 f-ast2 opensips[29507]:
DBG:db_postgres:db_postgres_store_result: 0x8197100
PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x94bb108)
Feb 12 13:42:59 f-ast2 kernel: [842163.893756] opensips[29504]: segfault
at 1 ip b77b98c6 sp bff1ab30 error 4 in presence.so[b7790000+31000]
Feb 12 13:42:59 f-ast2 opensips[29507]:
DBG:db_postgres:db_postgres_get_columns: 2 columns returned from the query

Feb 12 13:42:59 f-ast2 opensips[29507]: DBG:core:db_allocate_columns:
allocate 32 bytes for result columns at 0x819d518


----------------------------------------------------------------------

Comment By: Anca Vamanu (anca_vamanu)
Date: 2009-02-16 18:37

Message:
Hi Vasil,

Thank you for the report.
It seems that the problem is another one - not what you described.
From your log messages -  there is a query for two columns: 'select
status,reason from watchers.. '. However the driver returns only one
column: '1 columns returned from the query'. Most probably the second
column - reason is empty ( because it usually is) but I expected for the
driver to return two columns and the second with a null value.
The solution seems to be checking for the number of columns returned to
avoid this problem in postgres here. However there are other places where
null columns are in the middle of the queried columns and I wonder what
happens then..
I will look after the null column cases in presence and try to avoid this
error.

regards,
Anca


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2009-02-12 20:00

Message:

The issue is as follows:


When you do a query and after that you get context switched to another
thread that does a query, it will execute its own query and see the result
of the previous one. Simple scheme:

Thread A Thread B
do_query(A)
                        do_query(B)
                        fetch_result()
...

The result is that thread B crashes, as the results returned are from the
wrong query. I've seen crashes in pretty much every module that uses the DB
connection.

My initial solution was to just add a lock in db_con_t structure and track
it, but that didn't solve the problem, which in turn led me to find that
new connections are initialized in db/db.c in db_do_init(), and there every
new connection is checked if it exists in the pool. So if you use a few
modules with the same database (as I do in my config, I have the same
database/user/pass for uri_db, usrloc, auth_db, presence, presence_xml and
xcap_client), only one entry will be created in the db_pool stuff. This in
turn leads to having multiple db_con_t structures for one DB connection,
which made the lock pretty much useless.

As a hack, I just made the db_do_init() not to check in the pool if a
connection exists, but to just open a new one. This will have a pretty
nasty effect on the number of open database connections (my setup opened
about 70), but at least it shouldn't crash this way. My ugly patch is
available at http://tiki.securax.net/mx/db_lock.diff .

A good solution is to have internal locking per DB connection which in
turn can be accessed with a member of db_func_t from the relevant
functions. I don't seem to have a better idea right now...

--
Regards,
Vasil Kolev
Attractel NV
dCAP #1324, LPIC2

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2009-02-12 17:44

Message:
I have been looking into the problem for some time and it seems like
there's a race condition in the postgresql driver somewhere. I have seen
how a query A is made, then query B hits, and then the part that sent B
sees the results of A and crashes opensips.

I'll try to find the place where the lock should happen, but I'll need to
get a bit more familiar with the types of locking primitives used.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2593088&group_id=232389

_______________________________________________
Devel mailing list
[hidden email]
http://lists.opensips.org/cgi-bin/mailman/listinfo/devel