aMule Forum

Author Topic: Amulegui is failing to connect properly to amuled after a few hours  (Read 26864 times)

xor

  • Approved Newbie
  • *
  • Karma: 4
  • Offline Offline
  • Posts: 11
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #30 on: July 05, 2008, 10:36:58 PM »

Hi.

I think I've found a solution. I attach a patch.

After some hours of debugging, I think I've found what was going on. amulegui has a PollTimer that does roughly this: 1. Sends an EC_OP_STAT_REQ request. 2. Sends another EC_OP_STAT_REQ request. 3. If the download view is active, sends an EC_OP_GET_DLOAD_QUEUE request to get the active files. If everything goes OK, amuled replies to these requests in the same order.

But it turns out that sometimes, I don't know why, amuled replies in the wrong order, sending the EC_OP_STAT_REQ reply first even if EC_OP_GET_DLOAD_QUEUE was requested before it. When this happens, the handler in charge of processing the EC_OP_GET_DLOAD_QUEUE reply receives the answer to EC_OP_STAT_REQ instead, and as it doesn't know what to do with it, the reply gets lost. But the worst thing is that from then on the whole protocol is broken, because the amuled responses are handled by the wrong functions.
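To make the failure mode concrete, here is a minimal C++ sketch of a dispatcher that matches replies to handlers purely by arrival order. It is not aMule's code: `Op`, `OrderedDispatcher` and `CountMisrouted` are hypothetical names made up for illustration, and only the two opcode names are borrowed from the description above.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical stand-ins for the two EC requests mentioned in the post.
enum class Op { StatReq, GetDloadQueue };

// Every request queues the handler that expects its reply; replies are
// matched to handlers only by position in the queue.
class OrderedDispatcher {
public:
    void SendRequest(Op op) { expected_.push_back(op); }

    // Hands the reply to whichever handler is at the front of the queue.
    // Returns true if that handler was really waiting for this kind of reply.
    bool OnReply(Op reply_for) {
        if (expected_.empty()) return false;
        Op front = expected_.front();
        expected_.pop_front();
        return front == reply_for;
    }

private:
    std::deque<Op> expected_;
};

// Counts how many replies end up at a handler that did not ask for them.
int CountMisrouted(const std::vector<Op>& requests,
                   const std::vector<Op>& replies) {
    assert(requests.size() == replies.size());  // lost packets are not modelled here
    OrderedDispatcher d;
    for (Op op : requests) d.SendRequest(op);
    int misrouted = 0;
    for (Op op : replies) {
        if (!d.OnReply(op)) ++misrouted;
    }
    return misrouted;
}
```

With requests sent as StatReq, GetDloadQueue, replies in the same order are all routed correctly, while swapping the two replies sends both to the wrong handler.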

The patch I'm submitting is a workaround for this situation when it happens. It does not solve the real problem, which is that packets are sent in the wrong order. Anyway, to correct the problem the patch does the following:

In amule-remote-gui.h:

- HandlePacket in CRemoteContainer now detects if an "improper" response was sent, and if so it resends it to the right handler.
- When a DLOAD/ULOAD request is lost, the state machine used by HandlePacket and DoRequery gets stuck with m_state = STATUS_REQ_SENT. I've added a counter that resets the state to IDLE when DoRequery is called twice for the same request.
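
The counter idea in the second point can be sketched roughly like this. This is an illustration only, not the patch itself: `Requerier`, `blocked_polls_` and `kMaxBlockedPolls` are hypothetical names, and the threshold of two blocked polls is an assumption based on the wording "called twice for the same request".

```cpp
#include <cassert>

// Hypothetical names: the post only mentions m_state, STATUS_REQ_SENT and IDLE.
enum class State { Idle, StatusReqSent };

class Requerier {
public:
    // Called on every poll tick. Returns true if a new request was sent.
    bool DoRequery() {
        if (state_ == State::StatusReqSent) {
            // Still waiting for a reply. Count the blocked polls and give up
            // on the outstanding request once the threshold is reached.
            ++blocked_polls_;
            if (blocked_polls_ >= kMaxBlockedPolls) {
                state_ = State::Idle;
                blocked_polls_ = 0;
            }
            return false;
        }
        assert(blocked_polls_ == 0);  // counter is only non-zero while waiting
        state_ = State::StatusReqSent;
        return true;
    }

    // Called when the expected reply arrives.
    void HandlePacket() {
        state_ = State::Idle;
        blocked_polls_ = 0;
    }

    State state() const { return state_; }

private:
    static constexpr int kMaxBlockedPolls = 2;  // assumed threshold
    State state_ = State::Idle;
    int blocked_polls_ = 0;
};
```

Without the counter, a lost reply would leave the state at StatusReqSent forever and no further request would ever be sent. With it, the sketch recovers after two blocked polls.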

In amule-remote-gui.cpp:

- In PollTimer, I've removed the second EC_OP_STAT_REQ request. This way we save some bandwidth. The info is now partially handled in the second state of PollTimer.

Finally, I've added two new functions to RemoteConnect in the EC lib: ResendLastPacket and FifoRemoveOne. Only the first one is actually used by the patch; it resends the last packet to the last handler in the fifo, which should be the right one. The other function just removes that handler (without sending the packet again), which corrects the de-synchronization, but the packet is lost.

The patch was applied to the clean aMule-2.2.1 source code.

I hope this works for you people too!
Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #31 on: July 05, 2008, 11:30:21 PM »

This is very interesting. Thank you for your work! I'll look at your patch in detail, but it will take some time. (Or maybe someone else commits it tonight, who knows...  ;) )
Logged
The image of mother goddess, lying dormant in the eyes of the dead, the sheaf of the corn is broken, end the harvest, throw the dead on the pyre -- Iron Maiden, Isle of Avalon

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline Offline
  • Posts: 5795
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #32 on: July 07, 2008, 02:04:00 PM »

Unfortunately, we must first find what's wrong with the current code before hacking the behaviour.

I agree that some security measures to keep aMuleGUI from freaking out over some packets should be in place, but let's first make sure we don't do that, or we'll end up with a "sloppy" coded EC where errors like this are not coded.

xor, it's a fantastic finding you made there, don't get me wrong. I would just like someone to take a closer look at the symptoms.
Logged

xor

  • Approved Newbie
  • *
  • Karma: 4
  • Offline Offline
  • Posts: 11
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #33 on: July 07, 2008, 03:34:55 PM »

Kry, I agree with you that the devs should look carefully at the code before committing this patch, and I'm not sure it should be committed at all. It would be much better to fix the real problem instead of just making a workaround. I've sent it just to give other people a temporary solution (starting with me  :)), and to clearly expose the problem.

On the other hand, when I debugged the amulegui <-> amuled behaviour, I didn't see anything wrong (meaning that no packets seemed to be lost) except that packets were not arriving at amuled in the order they were sent to it. I think this is not bad by itself; amulegui should just have a central handler that detects amuled replies and sends them to the right handlers.
Logged

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline Offline
  • Posts: 5795
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #34 on: July 07, 2008, 04:02:41 PM »

On the other hand, when I debugged the amulegui <-> amuled behaviour, I didn't see anything wrong (meaning that no packets seemed to be lost) except that packets were not arriving at amuled in the order they were sent to it.

I think this is a very bad thing to see, given that neither you, nor me, nor anyone else at this point knows why that would be the case, which points to something really, really deep going on.

Your patch is good as a temporary workaround and it may (MAY) enter the next release if the reasons for the errors haven't been fixed by then. But I don't think it would enter the SVN.

Does that make sense?
Logged

xor

  • Approved Newbie
  • *
  • Karma: 4
  • Offline Offline
  • Posts: 11
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #35 on: July 07, 2008, 06:00:47 PM »

I think this is a very bad thing to see, given that neither you, nor me, nor anyone else at this point knows why that would be the case, which points to something really, really deep going on.

Your patch is good as a temporary workaround and it may (MAY) enter the next release if the reasons for the errors haven't been fixed by then. But I don't think it would enter the SVN.

Does that make sense?

OK, I'm not an amule dev and I don't know well how the protocol was planned and programmed, so please don't take what I've found as something 100% reliable (take it as 90%, for example  ;)). Maybe what's happening is not too serious. Only the people who programmed it can tell.

On the other hand, I've tested my patch on three computers (two running Mac OS X, Intel and PPC, and another one running Linux), and it seems to work fine. More people should try it before it is added to a release.

For me it makes a lot more sense to include the patch in a future aMule-2.2.x release than to commit it to SVN, provided no one else submits a better solution. For SVN a proper solution has to be found. I'm not going to be upset if my patch is not used; I've had a lot of fun debugging amule and learning how this wonderful software was programmed. If what I've found is useful, so much the better.

Logged

GonoszTopi

  • The current man in charge of most things.
  • Administrator
  • Hero Member
  • *****
  • Karma: 169
  • Offline Offline
  • Posts: 2685
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #36 on: July 07, 2008, 07:03:27 PM »

xor, we'd need a lot of users like you since we're short on developers ;)
Logged
concordia cum veritate

xor

  • Approved Newbie
  • *
  • Karma: 4
  • Offline Offline
  • Posts: 11
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #37 on: July 28, 2008, 11:55:25 PM »

Hi again.

I've finally found the real amulegui bug.

After properly debugging an amuled <-> amulegui session, I eventually discovered that amulegui sometimes fails to send a packet when it's heavily loaded receiving and processing packets from amuled. This usually happens at startup, when amulegui asks for the shared files and at the same time tries to send a request about the files being transferred. If you have a lot of shared files, you are more likely to trigger the bug. You can also force it just by going to the shared files window and scrolling up and down, which makes amulegui ask for more information about them.

The bug is in the EC library. amulegui calls SendRequest (in RemoteConnect.cpp), which first calls WritePacket and then OnOutput (in ECSocket.cpp). WritePacket starts by checking whether there is a SocketError, and if there is, it returns silently. But it shouldn't return if the SocketError is due to WouldBlock; that is a normal situation.

To fix the bug just locate line 681 in ECSocket.cpp and change:

   if (SocketError()) {
      return;
   }

to something like this:

   if (SocketError()) {
      // WouldBlock is a normal condition; only give up on other errors.
      if (!WouldBlock()) {
         OnError();
         return;
      }
   }

And enjoy! You can forget about my previous patch.
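
For anyone wondering why the early return loses packets, here is a small self-contained C++ sketch of the idea. It does not use the real ECSocket classes: `FakeSocket`, `Writer` and the `bail_on_would_block` flag are hypothetical and exist only so that the old and the new behaviour can be compared side by side.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical in-memory stand-in for a non-blocking socket.
struct FakeSocket {
    std::size_t room = 0;      // bytes the send buffer can take right now
    bool would_block = false;  // last write could not take everything
    bool hard_error = false;   // a real failure, e.g. connection reset
    std::string wire;          // bytes that actually went out

    std::size_t Write(const std::string& data) {
        std::size_t n = std::min(room, data.size());
        wire.append(data, 0, n);
        room -= n;
        would_block = (n < data.size());
        return n;
    }
    bool Error() const { return would_block || hard_error; }
};

class Writer {
public:
    explicit Writer(FakeSocket& s) : sock_(s) {}

    // bail_on_would_block = true mimics the old check (any error aborts);
    // false mimics the fixed check (only real errors abort).
    bool WritePacket(const std::string& packet, bool bail_on_would_block) {
        if (sock_.Error()) {
            if (bail_on_would_block || !sock_.would_block) {
                return false;  // the packet is silently dropped
            }
        }
        pending_ += packet;  // keep it until the socket can take it
        Flush();
        return true;
    }

    // Called when the socket becomes writable again.
    void OnOutput() { Flush(); }

private:
    void Flush() {
        std::size_t n = sock_.Write(pending_);
        assert(n <= pending_.size());
        pending_.erase(0, n);
    }

    FakeSocket& sock_;
    std::string pending_;
};
```

In this sketch, a packet written while the socket is in the would-block state is dropped under the old check, but queued and sent on the next OnOutput under the new one.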

Some last words: even with the bug fixed, it is clear that amulegui is very sensitive to packet loss. Packets are unlikely to be lost in normal situations, but if they are, the whole protocol gets desynchronized and cannot recover. Several solutions could solve the problem. One I've implemented is a central packet handler which detects amuled responses and sends them to the proper handler. If anyone is interested in this, please let me know and I'll send a patch.
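
As a rough idea of what such a central handler could look like, here is a minimal C++ sketch. It is not the attached patch and not the EC API: `ReplyKind`, `Reply` and `CentralHandler` are hypothetical, and it assumes that each reply can be identified by its own type.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical reply kinds; not the real EC opcodes.
enum class ReplyKind { Stats, DloadQueue };

struct Reply {
    ReplyKind kind;
    std::string payload;
};

class CentralHandler {
public:
    using Handler = std::function<void(const Reply&)>;

    void Register(ReplyKind kind, Handler h) {
        assert(h != nullptr);  // an empty handler would throw when invoked
        handlers_[kind] = std::move(h);
    }

    // Routes by the reply's own kind, so neither arrival order nor a lost
    // packet can shift later replies onto the wrong handler.
    bool Dispatch(const Reply& r) {
        auto it = handlers_.find(r.kind);
        if (it == handlers_.end()) return false;  // unknown reply: skip it
        it->second(r);
        return true;
    }

private:
    std::map<ReplyKind, Handler> handlers_;
};
```

Because routing depends only on the reply itself, a reply that arrives early, late or not at all affects just its own handler, and the rest of the session stays in sync.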

Logged

Stu Redman

  • Administrator
  • Hero Member
  • *****
  • Karma: 214
  • Offline Offline
  • Posts: 3739
  • Engines screaming
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #38 on: July 31, 2008, 10:20:38 PM »

Hi xor,

thank you very much for keeping your teeth in this problem - if this works out, you will surely get the credit for finally making the remote gui usable!
Everybody's busy here with 2.2.2 at the moment, and I don't think this fix will make it into there, but it will be evaluated as soon as possible.
Please also post your packet handler patch.
Logged
The image of mother goddess, lying dormant in the eyes of the dead, the sheaf of the corn is broken, end the harvest, throw the dead on the pyre -- Iron Maiden, Isle of Avalon

xor

  • Approved Newbie
  • *
  • Karma: 4
  • Offline Offline
  • Posts: 11
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #39 on: August 02, 2008, 01:44:27 PM »

The handler patch is attached here.

If I were you, for the 2.2.2 release I would include only the fix in ECSocket.cpp. It is easy to apply and it fixes the problem.

In any case, please, discard my previous patch. It's ugly and is just a workaround. This one I'm attaching is better, but it should be carefully tested.
Logged

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline Offline
  • Posts: 5795
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #40 on: August 02, 2008, 05:54:53 PM »

The ECSocket change will be in 2.2.2, even though I remember changing that myself and then finding something odd about the behaviour... but I assume you tested it extensively.

And if not, we'll blame you.
Logged

xor

  • Approved Newbie
  • *
  • Karma: 4
  • Offline Offline
  • Posts: 11
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #41 on: August 03, 2008, 11:26:12 PM »

Sorry to hear that.

I can only say that I've tested it on a PPC Mac and a Debian Linux box and it apparently works without affecting anything else. This is not an extensive test, but I cannot test it more, sorry. On the other hand, I can assure you that in its current state some packets that should be sent are not.

It's up to you to add the WouldBlock test or not. My amulegui is working fine now, and for me it is enough to know that other people can patch and fix it if they want. If you are not sure (and being an aMule dev you have much more knowledge than I have on this), then just don't apply it and let other people patch it if they want.
Logged

Kry

  • Ex-developer
  • Retired admin
  • Hero Member
  • *****
  • Karma: -665
  • Offline Offline
  • Posts: 5795
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #42 on: August 03, 2008, 11:30:12 PM »

It is already added to SVN actually. And will be in 2.2.2. And if anyone complains, I'm going to tell them your email address.

I'm just that much of a bastard.
Logged

GonoszTopi

  • The current man in charge of most things.
  • Administrator
  • Hero Member
  • *****
  • Karma: 169
  • Offline Offline
  • Posts: 2685
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #43 on: August 03, 2008, 11:35:46 PM »

Good ol' Kry... ;)
Logged
concordia cum veritate

xor

  • Approved Newbie
  • *
  • Karma: 4
  • Offline Offline
  • Posts: 11
Re: Amulegui is failing to connect properly to amuled after a few hours
« Reply #44 on: August 04, 2008, 12:31:54 AM »

That rocks! What a nice experience contributing to an open source project  ;D
Logged