June 19, 2009
filed at around evening time by DrScofield in: from the grid, research

after our intoxicating stress test ten days ago we repeated the stress test today with a group of volunteers — thus, real SL clients instead of bots.

the result: 21. :-(

somewhere around the 21 avatar mark we got stuck and the (single-region) standalone system became very laggy: avatars could not move anymore (or only erratically), and chat got relayed only occasionally. interestingly enough, we still saw new users logging in. looking at OpenSim’s log we saw that OpenSim was busy processing new user requests, a lot of new user requests.

so, the current hypothesis is that new users clog the out-pipe with a large number of texture requests, causing the existing users to starve and the system to become unresponsive.
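
a toy model of that hypothesis, as a minimal sketch and not OpenSim code: the packet categories, the `PRIORITY` table and the backlog of 500 texture packets are all made up for illustration. it counts how many packets leave the out-pipe before an existing user's avatar update does, first with a plain FIFO and then with a queue that lets updates jump ahead of textures.

```python
import heapq
from collections import deque

# Hypothetical packet categories (not OpenSim's); a lower number is sent first
# in the priority variant.
PRIORITY = {"avatar_update": 0, "chat": 0, "texture": 1}


def fifo_wait(packets, wanted):
    """Packets sent before the first `wanted` packet when the out-pipe is a plain FIFO."""
    queue = deque(packets)
    sent = 0
    while queue:
        if queue.popleft() == wanted:
            return sent
        sent += 1
    raise ValueError(f"no {wanted!r} packet in queue")


def priority_wait(packets, wanted):
    """Same count, but packets are drained by category priority, then arrival order."""
    heap = [(PRIORITY[kind], seq, kind) for seq, kind in enumerate(packets)]
    heapq.heapify(heap)
    sent = 0
    while heap:
        _, _, kind = heapq.heappop(heap)
        if kind == wanted:
            return sent
        sent += 1
    raise ValueError(f"no {wanted!r} packet in queue")


# Illustrative backlog: newly logged-in viewers have queued 500 texture packets,
# then an existing user moves.
backlog = ["texture"] * 500 + ["avatar_update"]

fifo_delay = fifo_wait(backlog, "avatar_update")
priority_delay = priority_wait(backlog, "avatar_update")
print(fifo_delay, priority_delay)
```

in the FIFO case the update waits behind the whole texture backlog, which is the starvation the hypothesis describes; whether the real out-pipe behaves like this is exactly what still needs to be confirmed.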

all content posted on these pages is an expression of my own mind. my employer is welcome to share these opinions but then again he might not want to.

11 comments »

  1. hm, 21 = 42 / 2… way to go!

    comment by matthias — June 22, 2009 @ 10:58

  2. perhaps we should run the stress tests at the restaurant at the end of the universe?

    comment by DrScofield — June 22, 2009 @ 11:37

  3. This comes somewhat as a surprise, because I remember the meeting on Dahlia’s Server from last year, sporting over 20 people without too many problems.

    I posted her mail from the list with the details here: http://tinypaste.com/07003

    Best, Dirk

    comment by Dirk Krause — June 22, 2009 @ 12:31

  4. @dirk: two comments on that:

    1. dahlia’s max was 21 avatars, our calamities started once we went past that number

    2. my suspicion is that the problem would not have occurred if the arrival rate of our users had been lower (that is, the time between login attempts had been longer). with lots of users getting logged in at once, we saw a packet explosion — probably caused by the viewers all requesting all the textures at once.

    comment by DrScofield — June 22, 2009 @ 13:11

  5. Is it possible to have the textures delivered externally? I mean, not by the simulator itself? The same way websites do, for example?

    comment by Grumly — June 22, 2009 @ 14:25

  6. One difference I have noticed between libomv bots and “real” clients/viewers is the number of packets they generate for the exact same behavior. For example, if you use libomv testclient to mimic the “forward” action, it sends packets to the server indicating the up arrow is depressed. The same packets come from the Hippo viewer but the rate is much much higher. Here are some example numbers I captured:

    TestClient Moving: 11 pkts/s
    TestClient Still: 2 pkts/s
    Hippo Moving: 185 pkts/s
    Hippo Still: 7 pkts/s

    When you consider each of 21 clients producing 185 pkts/s vs 11 pkts/s, that is a lot of extra network and scene update processing. This could be a part of the problem seen with “real” viewers. The movement resolution seems to be unnecessarily high. ~Dan

    comment by Dan Lake — June 22, 2009 @ 19:57
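
The comparison in Dan's comment can be worked through as a short sketch. The per-client rates are the ones he quotes; the assumption that all 21 avatars are moving at the same time is a worst case added here for illustration, and `aggregate_rate` is a hypothetical helper name.

```python
# Per-client packet rates (pkts/s) as quoted in the comment above.
RATES = {
    ("TestClient", "moving"): 11,
    ("TestClient", "still"): 2,
    ("Hippo", "moving"): 185,
    ("Hippo", "still"): 7,
}


def aggregate_rate(client: str, state: str, avatars: int) -> int:
    """Total inbound pkts/s if every avatar uses `client` and is in `state`."""
    return RATES[(client, state)] * avatars


AVATARS = 21  # the point at which the test region became laggy

bots = aggregate_rate("TestClient", "moving", AVATARS)
viewers = aggregate_rate("Hippo", "moving", AVATARS)
ratio = round(viewers / bots, 1)

print(bots, viewers, ratio)  # 231 3885 16.8
```

Under that worst-case assumption the simulator would see roughly 17 times as many movement packets from real viewers as from the bots.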

  7. @dan: interesting point. we are currently looking into the packet processing parts to see where we can reduce the numbers of objects created (there are a couple of places that, at least at first sight, appear a bit “trigger happy” with regards to object instantiation).

    comment by DrScofield — June 22, 2009 @ 20:04

  8. One thing that occurred to me when i logged in to what seemed a very slow grid the other day was that the server was trying to feed me all the prim data that was in view, all at the same time.

    Would it help to have the server queue prim downloads and sort by distance to the agent, to get the closest prims to render fully soonest?

    comment by Darryl — June 24, 2009 @ 04:53
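
Darryl's suggestion could look roughly like the following minimal sketch. It is not OpenSim code: the `Prim` type, the function name and the coordinates are hypothetical, and a real simulator would also have to re-sort as the agent moves.

```python
import heapq
import math
from typing import Iterable, Iterator, NamedTuple, Tuple

Vec3 = Tuple[float, float, float]


class Prim(NamedTuple):
    """Hypothetical stand-in for a scene object that must be sent to a viewer."""
    name: str
    position: Vec3


def prims_nearest_first(agent_pos: Vec3, prims: Iterable[Prim]) -> Iterator[Prim]:
    """Yield prims ordered by Euclidean distance to the agent, closest first."""
    # The index breaks ties so that two prims at equal distance are never compared.
    heap = [(math.dist(agent_pos, p.position), i, p) for i, p in enumerate(prims)]
    heapq.heapify(heap)
    while heap:
        _, _, prim = heapq.heappop(heap)
        yield prim


agent = (128.0, 128.0, 25.0)
scene = [
    Prim("far_tower", (250.0, 250.0, 25.0)),
    Prim("bench", (130.0, 128.0, 25.0)),
    Prim("tree", (140.0, 120.0, 25.0)),
]

send_order = [p.name for p in prims_nearest_first(agent, scene)]
print(send_order)  # ['bench', 'tree', 'far_tower']
```

A heap is used so the sender can pop only as many prims as the outgoing bandwidth allows per tick, instead of sorting and sending the whole scene at once.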

  9. good idea! sean also brought that up when we discussed this yesterday. need to figure out how complex this is. :-)

    comment by DrScofield — June 24, 2009 @ 07:49

  10. I would intuitively agree with your analysis about texture requests starving existing users, Dr S. This is the kind of phenomenon that I’ve seen before in OpenSim office hour meetings, although originally it appeared to be associated simply with the login process (though maybe it really was texture starvation underneath that).

    Hence, my original thought was to change the relative thread priorities so that during login, client threads got a much lower thread priority than existing clients. However, in looking down into Mono, I discovered that thread prioritization was completely unimplemented, only the stubs existed.

    I don’t know whether changing thread priorities within the VM would make a huge amount of difference or be all that relevant on Linux (which I assume you’re using for the tests!) but there wasn’t any opportunity to try.

    AFAIR, this was back on Mono 2.0.1 so things may have changed since then.

    comment by Justin Clark-Casey — July 3, 2009 @ 00:09

  11. I would be inclined to tinker with priorities at the content level rather than at the thread level…

    comment by Darryl — July 3, 2009 @ 02:32
