This version (2017/05/27 13:44) is a draft.

[00:13:53] <temporalfox> purplefox I found why the websocket test hangs :-)

[10:45:38] <gigo1980_> hi all, is it possible to have a module architecture like in OSGi, so that I can deploy only some parts of my application?

[10:46:00] <gigo1980_> do I have to do this with verticles?

[11:33:40] <temporalfox> gigo1980_ I think you can do that if your application is correctly designed for this purpose

[11:34:12] <gigo1980_> so it could be done with verticles?

[11:34:34] <gigo1980_> or do you mean writing the Vert.x application as an OSGi application?

[12:01:31] <temporal_> gigo1980_ I mean designing your application to be modular, functionally speaking

[12:01:45] <temporal_> gigo1980_ then you can deploy the parts you want using verticle deployment

[12:02:03] <gigo1980_> that was the answer I wanted to know ;)
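
Editor's note: the idea of deploying only some parts of a modular application can be sketched in plain Java. This is an analogy only, not the Vert.x API: `Module`, `register` and `deploy` are hypothetical names invented for the illustration. In a real Vert.x application each module would be a verticle started with `vertx.deployVerticle(...)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class ModularDeploymentSketch {

    /** Hypothetical stand-in for a deployable unit (a verticle, in Vert.x terms). */
    interface Module {
        void start();
    }

    // Every module the application knows how to build, keyed by name.
    private final Map<String, Supplier<Module>> available = new LinkedHashMap<>();
    // Names of the modules that have actually been started.
    private final List<String> started = new ArrayList<>();

    void register(String name, Supplier<Module> factory) {
        available.put(name, factory);
    }

    /** Starts only the requested modules and returns the names started so far. */
    List<String> deploy(List<String> wanted) {
        for (String name : wanted) {
            Supplier<Module> factory = available.get(name);
            if (factory == null) {
                throw new IllegalArgumentException("Unknown module: " + name);
            }
            factory.get().start();
            started.add(name);
        }
        return new ArrayList<>(started);
    }

    public static void main(String[] args) {
        ModularDeploymentSketch app = new ModularDeploymentSketch();
        app.register("http-api", () -> () -> System.out.println("http-api started"));
        app.register("mail", () -> () -> System.out.println("mail started"));
        app.register("metrics", () -> () -> System.out.println("metrics started"));

        // Only two of the three registered modules are deployed on this node.
        System.out.println(app.deploy(Arrays.asList("http-api", "mail")));
    }
}
```

The design point is the one temporal_ makes: the split into independent units happens in the application design, and the deployment step merely picks which units to start.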

[13:06:37] <purplefox> temporal_: regarding the latest PR…

[13:07:20] <purplefox> temporal_: i'm still not happy about the register handler changes

[13:07:33] <purplefox> i think half the problem with this PR is it's tackling several different things

[13:07:48] <purplefox> some things are uncontroversial but others not so

[13:25:03] <temporal_> purplefox tell me what you want me to change please and I will do it

[13:25:31] <purplefox> can you separate out the different fixes/changes?

[13:26:06] <temporal_> purplefox I already spent a lot of time on this …

[13:26:20] <temporal_> I mean untangling some coupled changes

[13:26:21] <purplefox> for example the register handler stuff doesn't look right to me

[13:26:23] <temporal_> and reviewing each

[13:26:25] <purplefox> but most of the rest looks ok

[13:26:29] <temporal_> will take a lot of time

[13:26:39] <temporal_> I can change the register handler stuff

[13:26:45] <purplefox> ok, can you remove the register handler changes then?

[13:26:53] <temporal_> no problem

[13:26:58] <purplefox> then apply them in a later pr?

[13:27:06] <temporal_> ok

[13:27:58] <purplefox> stuff like this in core i need to be sure it's not breaking stuff, and if there are many things in a single PR it gets very confusing

[13:28:04] <temporal_> I agree :-)

[13:29:22] <purplefox> also some of the changes for executeOnIO i am not sure about…

[13:29:54] <temporal_> I did my best on executeOnIO, they were causing hangs in the testsuite

[13:30:03] <purplefox> if you could separate the simple changes about metrics docs, and metrics close we can merge that because it's not controversial

[13:30:12] <purplefox> then we have two other main fixes:

[13:30:14] <temporal_> for instance, one in http client handler

[13:30:16] <purplefox> 1. register handler stuff

[13:30:22] <purplefox> 2. executeOnIO changes

[13:30:40] <temporal_> is that the executeOnIO block registers a websocket handshake decoder

[13:30:58] <temporal_> and rarely this registration happens after the websocket handshake reply of the server

[13:31:21] <purplefox> ok so can we look at these fixes one by one?

[13:31:30] <purplefox> it's all jumbled into a pile right now

[13:31:31] <temporal_> I mean rather it happens after the websocket reply of the server (not handshake)

[13:31:44] <temporal_> and therefore it's not decoded and lost

[13:31:59] <temporal_> I will do my best to separate them

[13:32:04] <temporal_> given they are sometimes coupled

[13:32:13] <temporal_> keep in mind tomorrow I will be travelling

[13:32:22] <temporal_> and I would like to finish this before

[13:32:23] <purplefox> or maybe add a comment or something explaining why you made a change

[13:32:31] <temporal_> yes it's not obvious

[13:32:49] <temporal_> my opinion is that executeOnIO block should only care about user api callbacks

[13:32:58] <temporal_> which are potentially blocking

[13:33:19] <temporal_> any internal state change in executeOnIO is liable to create a race condition

[13:33:30] <temporal_> especially if it depends on previous state

[13:33:43] <temporal_> like connectionMap update

[13:33:57] <temporal_> or netty websocket handler update

[13:39:50] <AlexLehm> purplefox: i wanted to mention, i have two prs for the core project open that are not related (at least directly) to the mail project, if somebody please could check that out

[13:40:29] <purplefox> ok

[13:41:03] <purplefox> AlexLehm: so.. a bit of feedback on the mail stuff

[13:41:20] <purplefox> i get some blocked thread warnings when running the test suite

[13:42:26] <purplefox> and it takes quite a long time to run (3+ minutes)

[13:42:59] <AlexLehm> yes i have not been able to figure out why this is taking so long in some cases, the same test runs in about 5 seconds sometimes

[13:43:19] <AlexLehm> sometimes it takes up to 80 seconds

[13:43:46] <purplefox> it takes > 3 minutes for me

[13:44:13] <AlexLehm> for individual tests?

[13:44:20] <purplefox> for the test suite

[13:44:42] <purplefox> and I get stuff like:

[13:44:42] <purplefox> Thread Thread[vert.x-eventloop-thread-2,5,main] has been blocked for 2369 ms time 2000000000
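
Editor's note: the warning quoted by purplefox reports a thread that has been busy for longer than an allowed limit. The trailing `2000000000` looks like that limit expressed in nanoseconds (2 seconds), but that reading is an assumption, not something stated in the chat. The sketch shows only the comparison such a check would make; `BlockedCheckSketch` and its message format are hypothetical and are not the Vert.x implementation.

```java
import java.util.concurrent.TimeUnit;

final class BlockedCheckSketch {

    /**
     * Returns a warning if a task that started at startNanos is still running
     * at nowNanos and has exceeded maxExecNanos; returns null otherwise.
     */
    static String check(String threadName, long startNanos, long nowNanos, long maxExecNanos) {
        long elapsed = nowNanos - startNanos;
        if (elapsed <= maxExecNanos) {
            return null; // still within the allowed execution time
        }
        return "Thread " + threadName + " has been blocked for "
                + TimeUnit.NANOSECONDS.toMillis(elapsed) + " ms (limit "
                + TimeUnit.NANOSECONDS.toMillis(maxExecNanos) + " ms)";
    }

    public static void main(String[] args) {
        long limit = TimeUnit.SECONDS.toNanos(2);
        // A task that has been running for 2369 ms exceeds a 2 s limit.
        System.out.println(check("eventloop-2", 0L, TimeUnit.MILLISECONDS.toNanos(2369), limit));
        // A task that has been running for 500 ms does not.
        System.out.println(check("eventloop-2", 0L, TimeUnit.MILLISECONDS.toNanos(500), limit));
    }
}
```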

[13:45:23] <purplefox> as you know we are approaching cut off for 3.0

[13:47:29] <purplefox> AlexLehm: one other thing, can you add more asserts in SMTPConnectionPool test?

[13:47:29] <AlexLehm> are you getting this when running the build locally?

[13:47:33] <purplefox> yes

[13:48:01] <AlexLehm> ok, it would be helpful if I could match your configuration to reproduce that

[13:48:14] <purplefox> how do you mean “my configuration” ?

[13:48:25] <purplefox> i just git cloned

[13:48:28] <purplefox> and mvn clean test

[13:49:25] <AlexLehm> i am running the build either on Windows or in VirtualBox and I didn't get the blocked warning, maybe i have some environment set differently

[13:50:02] <AlexLehm> about the asserts, i can add more

[13:50:10] <purplefox> for example:

[13:50:22] <purplefox> pool.getConnection(conn -> {

[13:50:22] <purplefox> pool.getConnection(conn2 -> {

[13:50:23] <purplefox> testContext.assertNotEquals(conn, conn2);

[13:50:23] <purplefox> testContext.assertEquals(2, pool.connCount());

[13:50:23] <purplefox> conn.returnToPool();

[13:50:24] <purplefox> conn2.returnToPool();

[13:50:26] <purplefox> pool.close(v -> async.complete());

[13:50:28] <purplefox> }, th -> {

[13:50:30] <purplefox> log.info(th);

[13:50:32] <purplefox> testContext.fail(th);

[13:50:34] <purplefox> });

[13:50:36] <purplefox> }, th -> {

[13:50:38] <purplefox> log.info(th);

[13:50:40] <purplefox> testContext.fail(th);

[13:50:42] <purplefox> });

[13:50:46] <purplefox> it would be better to assert the pool at all stages of the test

[13:50:56] <purplefox> and before and after connections are returned

[13:51:09] <AlexLehm> ok
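
Editor's note: purplefox's request to assert the pool state at every stage, including before and after connections are returned, can be sketched as follows. `TinyPool` is a hypothetical, synchronous stand-in written for this illustration; it is not the vertx-mail `SMTPConnectionPool`, whose API is callback-based as in the paste from the chat.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class PoolStageAsserts {

    /** Hypothetical minimal pool: reuses idle connections, opens new ones otherwise. */
    static final class TinyPool {
        private final Deque<Object> idle = new ArrayDeque<>();
        private int connCount; // open connections, in use plus idle

        Object getConnection() {
            if (!idle.isEmpty()) {
                return idle.pop();
            }
            connCount++;
            return new Object();
        }

        void returnToPool(Object conn) { idle.push(conn); }
        int connCount() { return connCount; }
        int idleCount() { return idle.size(); }
        void close() { idle.clear(); connCount = 0; }
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    /** Walks through the scenario and asserts the pool state after every step. */
    static void runScenario() {
        TinyPool pool = new TinyPool();
        check(pool.connCount() == 0, "a new pool holds no connections");

        Object conn = pool.getConnection();
        check(pool.connCount() == 1, "first get opens one connection");

        Object conn2 = pool.getConnection();
        check(conn != conn2, "two concurrent gets yield distinct connections");
        check(pool.connCount() == 2, "second get opens a second connection");

        pool.returnToPool(conn);
        check(pool.connCount() == 2 && pool.idleCount() == 1, "return keeps the connection open");

        pool.returnToPool(conn2);
        check(pool.connCount() == 2 && pool.idleCount() == 2, "both connections are idle");

        Object reused = pool.getConnection();
        check(reused == conn || reused == conn2, "an idle connection is reused");
        check(pool.connCount() == 2, "reuse does not open a third connection");

        pool.close();
        check(pool.connCount() == 0 && pool.idleCount() == 0, "close releases everything");
    }

    public static void main(String[] args) {
        runScenario();
        System.out.println("all stage assertions passed");
    }
}
```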

[13:51:48] <temporal_> purplefox here is the PR with worker server/client changes https://github.com/eclipse/vert.x/pull/1043

[13:53:05] <AlexLehm> purplefox: could you please send me the output of env and java -version you have when you are running the build, maybe there is an issue that only happens with openjdk or so

[13:53:59] <purplefox> i am just using standard oracle jdk:

[13:54:00] <purplefox> pool.getConnection(conn -> {

[13:54:00] <purplefox> pool.getConnection(conn2 -> {

[13:54:00] <purplefox> testContext.assertNotEquals(conn, conn2);

[13:54:00] <purplefox> testContext.assertEquals(2, pool.connCount());

[13:54:01] <purplefox> conn.returnToPool();

[13:54:02] <purplefox> conn2.returnToPool();

[13:54:04] <purplefox> pool.close(v -> async.complete());

[13:54:06] <purplefox> }, th -> {

[13:54:08] <purplefox> log.info(th);

[13:54:10] <purplefox> testContext.fail(th);

[13:54:12] <purplefox> });

[13:54:16] <purplefox> }, th -> {

[13:54:18] <purplefox> log.info(th);

[13:54:20] <purplefox> testContext.fail(th);

[13:54:22] <purplefox> });

[13:54:24] <purplefox> sec

[13:54:26] <purplefox> wrong paste

[13:54:28] <purplefox> java version "1.8.0_25"

[13:54:30] <purplefox> Java(TM) SE Runtime Environment (build 1.8.0_25-b17)

[13:54:32] <purplefox> Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)

[13:55:52] <AlexLehm> ok, that's about the same

[13:55:55] <AlexLehm> which Linux?

[13:56:32] <AlexLehm> i am using ubuntu with VirtualBox

[14:01:20] <purplefox> linux mint

[14:01:34] <purplefox> how many cores have you given your VM?

[14:02:18] <AlexLehm> 2 i think

[14:03:05] <AlexLehm> linux mint is based on ubuntu i think, that should be about the same

[14:04:00] <AlexLehm> ok, i will do some more tests

[14:06:25] <purplefox> temporal_: can we just go through your changes one by one, right now?

[14:06:30] <temporal_> yes

[14:06:34] <purplefox> i think it would be easier that way

[14:06:44] <AlexLehm> ok, i will come back this evening

[14:06:47] <purplefox> temporal_: ok, can you explain the first one to me?

[14:07:19] <temporal_> what do you call first one ?

[14:08:15] <purplefox> i don't mind which one you start with, take your pick :)

[14:08:25] <temporal_> in the latest PR ?

[14:08:57] <purplefox> i'm suggesting we go through the changes one by one

[14:08:57] <temporal_> in channelRead the thread is an eventLoop thread

[14:09:14] <purplefox> which one are you discussing here?

[14:09:28] <temporal_> ClientConnection

[14:09:42] <purplefox> ok

[14:09:53] <temporal_> the big executeFromIO block

[14:10:02] <temporal_> it modifies the netty stack

[14:10:13] <temporal_> and causes frames not to be decoded

[14:10:20] <temporal_> because the decoder is added asynchronously

[14:13:37] <temporal_> explanation : the handshakeComplete(ctx, response);

[14:14:05] <temporal_> causes in netty a modification of the websocket decoder stack

[14:14:23] <temporal_> and when it is in a worker thread it happens too late

[14:14:26] <temporal_> sometimes

[14:14:28] <temporal_> rarely

[14:16:28] <purplefox> the trouble with this is the synchronized block has been moved, so when you executeOnIO you might have race conditions with the state

[14:16:35] <purplefox> so you'd need to add more synchronized blocks

[14:16:58] <temporal_> yes I missed that obviously

[14:17:06] <temporal_> I need to remove the global block

[14:17:08] <purplefox> but that's getting ugly now

[14:17:16] <temporal_> and have smaller ones for each executeOnIO

[14:17:29] <temporal_> what do you mean by ugly ?

[14:17:32] <temporal_> tangled ?

[14:17:42] <temporal_> (it's already quite tangled)

[14:20:20] <purplefox> temporal_: i'm not sure I fully understand the modification of the netty stack stuff, can you elaborate?

[14:20:37] <temporal_> the handshakeComplete(ctx, response);

[14:22:27] <temporal_> it calls handshaker.finishHandshake(channel, response);

[14:22:44] <temporal_> that does this

[14:22:45] <temporal_> p.replace(ctx.name(), "ws-decoder", newWebsocketDecoder());

[14:22:58] <temporal_> in Netty WebSocketClientHandshaker

[14:23:19] <temporal_> in a worker thread, that might happen after receiving a websocket frame from the server

[14:23:49] <temporal_> (when the server sends a message on connect)

[14:24:00] <temporal_> so when netty receives the websocket frame it is not decoded

[14:24:11] <temporal_> and the client websocket does not call the websocket handler
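
Editor's note: the ordering problem temporal_ describes can be reproduced with a deterministic plain-Java simulation. `Pipeline`, `installDecoder` and `onFrame` are hypothetical stand-ins, not Netty or Vert.x classes, and the worker thread is modelled as a simple task queue so that the unlucky ordering (which the chat says is rare in practice) happens every time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class DecoderRaceSketch {

    /** Hypothetical stand-in for a channel pipeline with an optional frame decoder. */
    static final class Pipeline {
        boolean decoderInstalled;
        final List<String> delivered = new ArrayList<>(); // frames that reached the handler
        final List<String> lost = new ArrayList<>();      // frames that arrived undecoded

        void installDecoder() {
            decoderInstalled = true;
        }

        /** Called when data from the server is read. */
        void onFrame(String frame) {
            if (decoderInstalled) {
                delivered.add(frame);
            } else {
                lost.add(frame);
            }
        }
    }

    /**
     * Simulates: handshake reply is read, then the server immediately sends a frame.
     * With deferInstall the decoder swap is queued for another thread and only runs
     * after the frame has already been read.
     */
    static Pipeline run(boolean deferInstall) {
        Pipeline pipeline = new Pipeline();
        Queue<Runnable> workerQueue = new ArrayDeque<>(); // models tasks handed to a worker

        // Step 1: the handshake reply has been read; the decoder must be installed.
        if (deferInstall) {
            workerQueue.add(pipeline::installDecoder);
        } else {
            pipeline.installDecoder();
        }

        // Step 2: the server's first websocket frame is read right away.
        pipeline.onFrame("message-on-connect");

        // Step 3: the worker finally gets to run its queued tasks.
        while (!workerQueue.isEmpty()) {
            workerQueue.poll().run();
        }
        return pipeline;
    }

    public static void main(String[] args) {
        System.out.println("deferred install, lost frames: " + run(true).lost);
        System.out.println("inline install, delivered frames: " + run(false).delivered);
    }
}
```

When the decoder is installed inline, before any further data is read, the first frame is delivered. When the install is deferred, the frame is read first and never reaches the websocket handler, which matches the hang described in the chat.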

[14:29:20] <purplefox> temporal_: this is a very thorny area. I think it needs more thought. But this doesn't seem to be related to metrics, so I suggest we get the metrics stuff in and consider this later

[14:29:33] <temporal_> ok but

[14:29:45] <temporal_> the metrics tests trigger this when they execute :-)

[14:29:52] <temporal_> which causes the test suite to hang erratically

[14:29:55] <temporal_> and makes you scream :-)

[14:30:12] <temporal_> so we have a deadlock here :-)

[14:30:12] <purplefox> so let's ignore those tests for now, and add an issue to address the underlying issue

[14:30:21] <temporal_> ok

[14:30:29] <temporal_> so let me do another PR only for the metrics stuff

[14:33:16] <temporal_> shall I remove the eventbus registration context changes in this PR ?

[14:34:04] <purplefox> yes please as I think that is a separate issue too

[14:35:16] <temporal_> ok

[15:57:20] <temporal_> purplefox here is the PR for metrics https://github.com/eclipse/vert.x/pull/1044

[17:14:48] <diega> Hi ppl, is there some flag to set in maven to disable doc generation in the vertx-redis-client project?

[17:15:17] <purplefox> diega: -DskipDocs

[17:16:17] <diega> purplefox, well, that makes perfect sense :P. Thanks a lot

[17:16:46] <purplefox> diega: np!

[17:29:43] <aesteve> hi everyone

[17:30:25] <aesteve> purplefox: just to inform you : I think I faced a bug in redis-client (quite important)

[17:31:33] <aesteve> I know the final release is quite close, maybe it's concerning ? https://github.com/vert-x3/vertx-redis-client/issues/8 just so that you don't miss it

[17:40:58] <purplefox> aesteve: thanks. I'll add to the big pile of other stuff we need to look at too :)

[17:49:14] <aesteve> np, good luck with all that stuff !

[17:50:24] <aesteve> also, I made progress on the Server-Sent-Event stuff. I'll keep my repo up-to-date, with tests and everything ready to be integrated into apex (right package names, …) once 3.0 is out

[17:54:41] <stephane_bastian> purplefox: just to keep you in the loop. I've used the latest Auth API and made some progress to authenticate using Facebook. However I hit a wall and had to modify the AuthProvider API again for this to work.

[17:56:56] <stephane_bastian> purplefox: I am making sure that everything is working before reporting back. Just wanted to share that we may have to slightly tweak the Auth API to be able to fulfil more advanced use cases

[18:03:18] <D-Spair> VirtualJUG session is about to start: http://virtualjug.com/

[18:09:16] <stephane_bastian> Does anyone know why each time I work on a branch in vertx-web, I am always getting files modified (not by me..) in the folder src/test/sockjs-protocol/venv/lib/python2.7/ ?

[18:27:51] <aesteve> stephane_bastian: it must be when you run the tests, no ?

[18:28:28] <stephane_bastian> aesteve: yes, when I run mvn clean package for instance

[18:28:35] <stephane_bastian> do you have the same issue?

[18:29:09] <aesteve> I haven't worked with vertx-web so far

[18:29:32] <aesteve> but I guess there's a sockJS client implementation in Python which is used to test Apex's sockJSHandler

[18:29:35] <stephane_bastian> well it was the same with apex

[18:29:46] <stephane_bastian> ok. thx

[18:38:38] <aesteve> mmh I have an understanding issue. If client and server use a chunked connection and the client wants to close the connection. Which method on which object should he call ?

[18:40:06] <aesteve> I mean, there's HttpServerResponse.close for the server to close the connection. But if the HttpClient wants to disconnect ?

[18:40:49] <aesteve> simply HttpClient.close ?

[21:32:28] <AlexLehm> purplefox, i have set up a linux mint and i am still getting a build time of about 35 seconds

[21:42:27] <AlexLehm> sorry, built the wrong branch

[21:43:07] <AlexLehm> ok i can reproduce the issue with this, some tests fail due to a timeout which means they take more than two minutes