Closed Bug 1528317 Opened 5 years ago Closed 5 years ago

Increase in "Network Protocol Error" reports in Firefox 65

Categories

(Core :: Networking, defect, P1)

65 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
firefox-esr60 --- unaffected
firefox65 + wontfix
firefox66 + fixed
firefox67 + fixed

People

(Reporter: philipp, Assigned: dragana, NeedInfo)

References

Details

(Keywords: regression, Whiteboard: [necko-triaged])

Attachments

(3 files)

after the update to firefox 65, there are a number of reports on various support channels from people sporadically receiving the following error messages when attempting to connect to popular sites like google search, yahoo mail and others:

Network Protocol Error

Firefox has experienced a network protocol violation that cannot be repaired.

The page you are trying to view cannot be shown because an error in the network protocol was detected.

Please contact the web site owners to inform them of this problem.

so far i couldn't spot any similarities (same isp or security software) in the configuration of affected users and am unsure how to debug/diagnose this further...

can anyone of them get a http log? or is it happening randomly?

Flags: needinfo?(madperson)

from the descriptions it appears to be a fairly intermittent problem, but i can ask around if someone is able to come up with a log anyway...

this is probably http 2 connection and it means we are getting something that is not by the spec or we have a bug and interpret something that is by the spec as being wrong.

The new thing in 65 is websockets over http2 and te:trailers header. Websockets over http2 can be turned off by a pref, so if somebody is getting the error frequently he/she can test turning of websockets. There is not pref to turn off the te trailers header.

Log of error occurring on Firefox 65.0 OSX 10.11.6 while trying to archive a page on archive.org. Note the asterisk in the generated archive.org URL...

Not sure what I was looking for - turned off network.https.spdy.websockets & tested, turned on network.websocket.allowInsecureFromHTTPS and network.websocket.autofollow-http-redirects & tested. Same result (error page) in both cases.

Same error trying to browse archive.org by entering a URL into the search box here: https://archive.org/web/web.php

clearing ni - a log was provided through a private channel and here in the bug as well, thanks to juan!
did they provide any insight into the cause of the problem?

Flags: needinfo?(madperson)

(In reply to Juan Crstián from comment #5)

Created attachment 9044517 [details]
mozlog-20190217-ff65-0-osx-10-11-6.log-main.501

Log of error occurring on Firefox 65.0 OSX 10.11.6 while trying to archive a page on archive.org. Note the asterisk in the generated archive.org URL...

We have a NS_ERROR_NET_HTTP2_SENT_GOAWAY (0x804B0053)
We showed "Network Protocol Error" from 65 since bug 1050329 introduced a new error message.
i.e., it's not a behavior regression

More deeper, looks like the error starts from PR_WOULD_BLOCK_ERROR (NS_BASE_STREAM_WOULD_BLOCK)
and later we receive a GOAWAY from server.

Similar pattern in Comment 7

What do you think about the log in Comment 5 and Comment 7, dragana?

2019-02-17 20:38:39.220020 UTC - [Parent 501: Socket Thread]: D/nsSocketTransport nsSocketInputStream::Read [this=0x11dfad688 count=9]
2019-02-17 20:38:39.220024 UTC - [Parent 501: Socket Thread]: D/nsSocketTransport   calling PR_Read [count=9]
2019-02-17 20:38:39.220030 UTC - [Parent 501: Socket Thread]: D/nsSocketTransport   PR_Read returned [n=-1]
2019-02-17 20:38:39.220034 UTC - [Parent 501: Socket Thread]: I/nsHttp Http2Session 0x11cb29800 buffering frame header read failure 80470007

...

2019-02-17 20:38:39.834599 UTC - [Parent 501: Socket Thread]: V/nsHttp nsHttpConnection::CloseTransaction[this=0x11dfb0400 trans=0x11cb29800 reason=80470002]
2019-02-17 20:38:39.834604 UTC - [Parent 501: Socket Thread]: V/nsHttp nsHttpConnection::DontReuse 0x11dfb0400 spdysession=0x11cb29800
2019-02-17 20:38:39.834608 UTC - [Parent 501: Socket Thread]: I/nsHttp Http2Session::DontReuse 0x11cb29800
2019-02-17 20:38:39.834613 UTC - [Parent 501: Socket Thread]: I/nsHttp Http2Session::Close 0x11cb29800 0
2019-02-17 20:38:39.834618 UTC - [Parent 501: Socket Thread]: I/nsHttp Http2Session::CloseStream 0x11cb29800 0x112d0f320 0xf 804B0053
2019-02-17 20:38:39.834648 UTC - [Parent 501: Socket Thread]: V/nsHttp nsHttpTransaction::Close [this=0x11e051800 reason=804b0053]
2019-02-17 20:38:39.834657 UTC - [Parent 501: Socket Thread]: V/nsHttp nsHttpConnectionMgr::RemoveActiveTransaction t=0x11e051800 tabid=100000361(1) thr=0
2019-02-17 20:38:39.834709 UTC - [Parent 501: Socket Thread]: V/nsHttp nsHttpTransaction::RemoveDispatchedAsBlocking this=0x11e051800 not blocking
2019-02-17 20:38:39.834719 UTC - [Parent 501: Socket Thread]: V/nsHttp nsHttpTransaction 0x11e051800 request context set to null in ReleaseBlockingTransaction() - was 0x1c62ee900
2019-02-17 20:38:39.834917 UTC - [Parent 501: Socket Thread]: I/nsHttp Http2Stream::~Http2Stream 0x112d0f320
2019-02-17 20:38:39.835000 UTC - [Parent 501: Main Thread]: D/nsHttp nsHttpChannel::OnStartRequest [this=0x1150de000 request=0x1126781f0 status=804b0053]
Flags: needinfo?(dd.mozilla)

Too late to do anything about this for 65, but tracking this for later releases so it stays on our radar while investigation continues.

I'm one of the users who face this issue (tens of times a day, randomly.) I'll be happy to help you debug this. Let me know what log files etc you need.

My system config is:

Operating System: Manjaro Linux
KDE Plasma Version: 5.15.1
KDE Frameworks Version: 5.55.0
Qt Version: 5.12.1
Kernel Version: 4.19.24-1-MANJARO
OS Type: 64-bit
Processors: 4 × Intel® Core™ i7-7500U CPU @ 2.70GHz
Memory: 7.6 GiB of RAM

Bug 1050329 has introduced the 'Network Protocol Error' error message. But we are show it more often than we should probably.

for example if network disconnect we will show this error, which is not really correct, the network connection broke there is no protocol error.

the log files show a bit different case. we get GOAWAY from a server(web.archive.org) with error ENHANCE_YOUR_CALM, but we do not exceed stream count limit. The server sets that limit to 10 which is very very low. We use 7 streams for the priority tree(we never send request on them). (for example rfc recommend 100).

Flags: needinfo?(dd.mozilla)

(In reply to dufferzafar0 from comment #11)

I'm one of the users who face this issue (tens of times a day, randomly.) I'll be happy to help you debug this. Let me know what log files etc you need.

My system config is:

Operating System: Manjaro Linux
KDE Plasma Version: 5.15.1
KDE Frameworks Version: 5.55.0
Qt Version: 5.12.1
Kernel Version: 4.19.24-1-MANJARO
OS Type: 64-bit
Processors: 4 × Intel® Core™ i7-7500U CPU @ 2.70GHz
Memory: 7.6 GiB of RAM

do you know on which sites you notice this? I want to know if we have more cases similar to web.archive.org.

In bug 1050329 we have introduced a new error and an error page to show that there is a h2 protocol error. The problem is that we show that error when we close a h2 connection for any reason, like a network timeout. Also Http2Session::SessionError does not return NS_ERROR_ILLEGAL_VALUE error but NS_ERROR_NET_HTTP2_SENT_GOAWAY.

Assignee: nobody → dd.mozilla
Priority: -- → P1
Whiteboard: [necko-triaged]
Keywords: checkin-needed

Comment on attachment 9049218 [details]
Bug 1528317 - Return NS_ERROR_NET_HTTP2_SENT_GOAWAY only when we have a protocol error.

Beta/Release Uplift Approval Request

  • Feature/Bug causing the regression: Bug 1050329
  • User impact if declined: We are showing "Network Protocol Error " too often (for errors that are not Http2 protocol error). The biggest problem is that we show the error also when network connection breaks. (Before bug 1050329 we were not showing any error page)
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: none
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch only changes error codes we return (instead of error code NS_ERROR_NET_HTTP2_SENT_GOAWAY which will show error page we return NS_ERROR_ABORT which does not show any error page).
  • String changes made/needed:
Attachment #9049218 - Flags: approval-mozilla-beta?

Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/3e844ef9583b
Return NS_ERROR_NET_HTTP2_SENT_GOAWAY only when we have a protocol error. r=valentin

Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67

Comment on attachment 9049218 [details]
Bug 1528317 - Return NS_ERROR_NET_HTTP2_SENT_GOAWAY only when we have a protocol error.

Fix for a recent regression, verified in nightly;
OK for uplift for the 66 RC.

Attachment #9049218 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9049218 [details]
Bug 1528317 - Return NS_ERROR_NET_HTTP2_SENT_GOAWAY only when we have a protocol error.

On a second look, holding back on this for a bit.

Attachment #9049218 - Flags: approval-mozilla-beta+ → approval-mozilla-beta?

Dragana, can you double check something, I just tested the issue I was having in bug 1530108 with this change, and rather than seeing a network protocol error, the Wayback Machine page just has no response at all.

Flags: needinfo?(dd.mozilla)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

can you make a http log for me I cannot reproduce the issue.

Http logging:
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging

i looked at you log from bug 1530108 but the log is not complete, there is no http2 session log.

you can also send me the log via email.

Flags: needinfo?(dd.mozilla)

OK, sent you a log of the issue happening on two different pages, the main archive.org (where you enter a url into Wayback Machine) and also https://archive.org/web/.

Flags: needinfo?(dd.mozilla)

(In reply to Liz Henry (:lizzard) (use needinfo) from comment #24)

OK, sent you a log of the issue happening on two different pages, the main archive.org (where you enter a url into Wayback Machine) and also https://archive.org/web/.

I looked at the log. The server is closing connection without sanding a response.

I will open a tech-evangelist bug for the issue described in comment 13.

Flags: needinfo?(dd.mozilla)

Comment on attachment 9049218 [details]
Bug 1528317 - Return NS_ERROR_NET_HTTP2_SENT_GOAWAY only when we have a protocol error.

OK for uplift to m-r for the 66 RC.

Attachment #9049218 - Flags: approval-mozilla-beta? → approval-mozilla-release+

I'm emailing a friend at archive.org to ask how to properly report the issue to them.

I think they are aware of the issue now, but it still remains broken when I tried with 68 nightly this morning.

Wayback Machine/archive.org is still broken in 68. I'll try reaching out again.

Dragana, did you open a new tech evangelism bug you mentioned in comment 25?

Flags: needinfo?(dd.mozilla)

(In reply to Liz Henry (:lizzard) (use needinfo) from comment #31)

Dragana, did you open a new tech evangelism bug you mentioned in comment 25?

yes bug 1534344

Flags: needinfo?(dd.mozilla)
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED

I'm seeing this error in Firefox 69 on one of my clients' websites. Tested on a clean profile.

Strangely enough, it does work sometimes, most of the time though it doesn't.

This is my logfile: https://pastebin.com/K1GF1SKg

Flags: needinfo?(dd.mozilla)

(In reply to kuba.serafinowski from comment #33)

I'm seeing this error in Firefox 69 on one of my clients' websites. Tested on a clean profile.

Strangely enough, it does work sometimes, most of the time though it doesn't.

This is my logfile: https://pastebin.com/K1GF1SKg

from thee log above, I see that push ID is wrong:
2019-09-05 08:07:19.012000 UTC - [Parent 14064: Socket Thread]: I/nsHttp Http2Session::RecvPushPromise 0000016C8D7DD800 ID too low 2 expected > 2.
2019-09-05 08:07:19.012000 UTC - [Parent 14064: Socket Thread]: I/nsHttp Http2Session::SessionError 0000016C8D7DD800 reason=0x1 mPeerGoAwayReason=0x1f

we have already receive a pushPromise with ID 2. This is a a bug on the server.

Flags: needinfo?(dd.mozilla)

(In reply to kuba.serafinowski from comment #33)

… Strangely enough, it does work sometimes, most of the time though it doesn't. …

Might the working cases be private windows?

Compare with bug 1530108 comment 17 and https://redd.it/ejy2kv

Flags: needinfo?(kuba.serafinowski)

(In reply to Graham Perrin from comment #35)

(In reply to kuba.serafinowski from comment #33)

… Strangely enough, it does work sometimes, most of the time though it doesn't. …

Might the working cases be private windows?

Compare with bug 1530108 comment 17 and https://redd.it/ejy2kv

Actually, yes — at the moment Firefox is throwing these errors when I try to visit any Twitter URL (even https://twitter.com/) — but not in private windows.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: