Developers help: cross-socket bugs

Questions about Wine on Linux
Locked
Daedroth
Newbie
Posts: 4
Joined: Fri Jun 29, 2018 7:24 pm

Post by Daedroth

Houston, we have a problem! I need help from experienced client-server application developers.
The server is multi-threaded and uses UDP. Each thread listens on its own socket bound to its own port, with consecutive port numbers from StartPort to StartPort + Number_of_threads - 1. The software can be described briefly like this:

Code:

procedure ListenThread(lpParameter: Pointer); stdcall;
var
     NT: Integer; //thread num, local to each thread
begin
     NT:=Integer(lpParameter);
     socket();
     bind("0.0.0.0", port := StartPort + NT);
     while not shutdown do //shutdown is changed in the main thread
     begin
          RecvFrom();
          if copy(buffer,1,10)='querytype1' then
          begin
               //get some parameters
               //generate reply
               //log to file named querytype1_port.log
          end else
          if copy(buffer,1,10)='querytype2' then
          begin
               //get some parameters
               //generate reply
               //log to file named querytype2_port.log
          end;
     end;
     closesocket();
     CloseHandle();
end;

//Main procedure
begin
     StartPort:=20000;
     NumServers:=10;
     WSAStartup();
     for i:=0 to NumServers-1 do
     begin
          //one listener thread per port, thread number passed as the parameter
          hThread[i] := CreateThread(nil, 0, @ListenThread, Pointer(i), 0, lpThreadId);
          sleep(10);
     end;
     WaitForMultipleObjects();
     WSACleanup();
end;

The client should send querytype1, analyze the reply, and possibly send querytype2 later. The server logs all queries to separate per-thread files (opened and closed for each line written) and counts the queries of each type. File and socket handles are local variables.

The problem is that sometimes, under load, a packet is received and processed by the wrong thread, i.e. on the wrong socket. How did I find it? The log file querytype2_20001 contains a record of a packet sent from some IP. I looked for the same IP in querytype1_20001 (that query must have been made first), but I found it in querytype1_20000! Again, the port number is stored in a global array of structures, Servers[NT].port, and each element is accessed by only one thread (NT is a local integer that is set once at thread creation and never changes). I had never noticed this bug before. It shows up only on my 2 latest server installations, Debian 8 and Debian 9, set up in December and now in February. It has also never happened under Windows.
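
The manual cross-check described above (find a querytype2 record, then look for a querytype1 record from the same IP in the log of the same port) could be automated roughly as follows. This is only a sketch: the TLogEntry record, the Entry, HasEntry and CountOrphans helpers, and the sample addresses are all hypothetical, and it assumes the log lines have already been parsed into IP and port pairs.

```pascal
program LogCrossCheck;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  // One parsed log line: who sent the query and which port's log recorded it.
  TLogEntry = record
    IP: string;
    Port: Integer;
  end;

// Hypothetical helper that builds a log entry.
function Entry(const AIP: string; APort: Integer): TLogEntry;
begin
  Result.IP := AIP;
  Result.Port := APort;
end;

// True if Log holds a record for this IP on this port.
function HasEntry(const Log: array of TLogEntry; const AIP: string;
  APort: Integer): Boolean;
var
  i: Integer;
begin
  Result := False;
  for i := Low(Log) to High(Log) do
    if (Log[i].IP = AIP) and (Log[i].Port = APort) then
    begin
      Result := True;
      Exit;
    end;
end;

// Counts querytype2 records that have no querytype1 record
// from the same IP in the log of the same port.
function CountOrphans(const Q1, Q2: array of TLogEntry): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := Low(Q2) to High(Q2) do
    if not HasEntry(Q1, Q2[i].IP, Q2[i].Port) then
    begin
      WriteLn('querytype2 from ', Q2[i].IP, ' on port ', Q2[i].Port,
        ' has no querytype1 on that port');
      Inc(Result);
    end;
end;

var
  Q1, Q2: array[0..1] of TLogEntry;
  Orphans: Integer;
begin
  // Hypothetical sample data (documentation-range IPs), not from the real logs.
  Q1[0] := Entry('192.0.2.10', 20000);
  Q1[1] := Entry('192.0.2.20', 20001);
  Q2[0] := Entry('192.0.2.10', 20001); // its querytype1 was logged on 20000
  Q2[1] := Entry('192.0.2.20', 20001);
  Orphans := CountOrphans(Q1, Q2);
  WriteLn('orphaned querytype2 records: ', Orphans);
end.
```

With the sample data only the first querytype2 record is flagged. Running such a check over a full day of logs would show how many packets ended up in another thread's log and which ports they moved between.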

Now I have written a stress test for my app: 32 threads simultaneously send 1000 querytype1 packets each to the even port numbers, and the counters on the odd ports are still zero, so I can't reproduce it artificially on the same system (using a different port range). But it happened yesterday according to the log files. A client sending querytype1 to one server and querytype2 to another is impossible; the IP of both queries must be logged by the same server (i.e. thread/port/socket). The total number of queries logged by my server (all threads) differs by 0.5% from outside estimates. Each port (thread/socket) differs by about 1% to 8% from independent estimates and, importantly, if a port receives a high number of packets the difference is negative (packets sent to this port were logged by other, less-loaded threads), while ports with a low number of requests show higher values. The total was about 150000 packets from 50000 IPs over 24 hours, which is not much. I have previously run a single server (single port) processing about 100 pps.

Could somebody guess how this could happen? Maybe some problem with the packet queue or the network stack?