OMG: Sourcegraph (cs.tvl.fyi) is down
#408
Opened by aspen at
Currently returning a 502 Bad Gateway
I ran
sudo systemctl restart sourcegraph
and it's back.
Looking at the logs before I restarted it, I see:
10:11:36 postgres | 2024-07-05 10:11:36.267 UTC [44] PANIC: could not write to log file 0000000100000003000000E6 at offset 1359872, length 8192: No space left on device 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [25] LOG: WAL writer process (PID 44) was terminated by signal 6: Aborted 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [25] LOG: terminating any other active server processes 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155777] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155777] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155777] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155778] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155778] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155778] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155755] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155755] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155755] HINT: In a moment you should be able to reconnect to the database and repeat your command. 
10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155627] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155627] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155627] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155763] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155763] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155763] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155487] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155487] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155487] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155656] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155656] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 
10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155656] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155616] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155616] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155616] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155531] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155531] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155531] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155179] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155179] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155179] HINT: In a moment you should be able to reconnect to the database and repeat your command. 
10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155382] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155382] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155382] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155305] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155305] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155305] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155756] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155756] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155756] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154906] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154906] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 
10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154906] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155131] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155131] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [155131] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154874] WARNING: terminating connection because of crash of another server process 10:11:36 repo-updater | t=2024-07-05T10:11:36+0000 lvl=eror msg="Repository authz config was invalid (errors are visible in the UI as an admin user, you should fix ASAP). Restricting access to repositories by default for now to be safe." seriousProblems="[Could not list external services: unexpected EOF]" 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154874] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 repo-updater | t=2024-07-05T10:11:36+0000 lvl=eror msg="Failed to compute schedule" err="schedule users with outdated permissions: unexpected EOF" 10:11:36 precise-code-intel-worker | t=2024-07-05T10:11:36+0000 lvl=eror msg="Repository authz config was invalid (errors are visible in the UI as an admin user, you should fix ASAP). Restricting access to repositories by default for now to be safe." seriousProblems="[Could not list external services: unexpected EOF]" 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154874] HINT: In a moment you should be able to reconnect to the database and repeat your command. 
10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154855] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154855] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154855] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154785] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154785] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154785] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154741] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154741] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154741] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154949] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154949] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 
10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154949] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154171] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154171] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [154171] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [152701] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [152701] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [152701] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [879] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [879] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [879] HINT: In a moment you should be able to reconnect to the database and repeat your command. 
10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [150575] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [150575] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.357 UTC [150575] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [151497] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [151497] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [151497] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [131512] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [131512] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [131512] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [45] WARNING: terminating connection because of crash of another server process 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [45] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 
10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [45] HINT: In a moment you should be able to reconnect to the database and repeat your command. 10:11:36 postgres | 2024-07-05 10:11:36.358 UTC [46] LOG: could not open temporary statistics file "pg_stat/global.tmp": No space left on device 10:11:36 postgres | 2024-07-05 10:11:36.362 UTC [155788] LOG: PID 155755 in cancel request did not match any process 10:11:36 postgres | 2024-07-05 10:11:36.366 UTC [155789] LOG: PID 154171 in cancel request did not match any process 10:11:36 postgres | 2024-07-05 10:11:36.366 UTC [25] LOG: all server processes terminated; reinitializing 10:11:36 postgres | 2024-07-05 10:11:36.386 UTC [25] LOG: could not write to file "postmaster.pid": No space left on device 10:11:36 postgres | 2024-07-05 10:11:36.393 UTC [25] FATAL: could not access status of transaction 0 10:11:36 postgres | 2024-07-05 10:11:36.393 UTC [25] DETAIL: Could not open file "pg_notify/0000": No space left on device. 10:11:36 postgres | 2024-07-05 10:11:36.396 UTC [25] LOG: database system is shut down 10:11:36 postgres | Terminating postgres postgres died. Shutting down... 
10:11:36 worker | {"SeverityText":"INFO","Timestamp":1720174296401469234,"InstrumentationScope":"worker.batches-bulk-processor.routines","Caller":"workerutil/worker.go:220","Function":"github.com/sourcegraph/sourcegraph/internal/workerutil.(*Worker).Start","Body":"Shutting down dequeue loop","Resource":{"service.name":"worker","service.version":"3.40.0","service.instance.id":"940a0fc9-fa3e-4fcc-ad9b-b83f8d8d76c1"},"Attributes":{"name":"batches_bulk_processor","reason":""}} 10:11:36 worker | {"SeverityText":"INFO","Timestamp":1720174296417798155,"InstrumentationScope":"worker.codeintel-auto-indexing.routines","Caller":"workerutil/worker.go:220","Function":"github.com/sourcegraph/sourcegraph/internal/workerutil.(*Worker).Start","Body":"Shutting down dequeue loop","Resource":{"service.name":"worker","service.version":"3.40.0","service.instance.id":"940a0fc9-fa3e-4fcc-ad9b-b83f8d8d76c1"},"Attributes":{"name":"precise_code_intel_dependency_sync_scheduler_worker","reason":""}} 10:11:36 worker | {"SeverityText":"INFO","Timestamp":1720174296417876212,"InstrumentationScope":"worker.codeintel-auto-indexing.routines","Caller":"workerutil/worker.go:220","Function":"github.com/sourcegraph/sourcegraph/internal/workerutil.(*Worker).Start","Body":"Shutting down dequeue loop","Resource":{"service.name":"worker","service.version":"3.40.0","service.instance.id":"940a0fc9-fa3e-4fcc-ad9b-b83f8d8d76c1"},"Attributes":{"name":"precise_code_intel_dependency_indexing_scheduler_worker","reason":""}} 10:11:36 gitserver | t=2024-07-05T10:11:36+0000 lvl=warn msg="Setting clone status in DB" error="setting clone status: unexpected EOF" 10:11:36 worker | {"SeverityText":"INFO","Timestamp":1720174296418975081,"InstrumentationScope":"worker.batches-workspace-resolver.routines","Caller":"workerutil/worker.go:220","Function":"github.com/sourcegraph/sourcegraph/internal/workerutil.(*Worker).Start","Body":"Shutting down dequeue 
loop","Resource":{"service.name":"worker","service.version":"3.40.0","service.instance.id":"940a0fc9-fa3e-4fcc-ad9b-b83f8d8d76c1"},"Attributes":{"name":"batch_changes_batch_spec_resolution_worker","reason":""}} 10:11:36 worker | {"SeverityText":"INFO","Timestamp":1720174296425656514,"InstrumentationScope":"triggers","Caller":"workerutil/worker.go:220","Function":"github.com/sourcegraph/sourcegraph/internal/workerutil.(*Worker).Start","Body":"Shutting down dequeue loop","Resource":{"service.name":"worker","service.version":"3.40.0","service.instance.id":"940a0fc9-fa3e-4fcc-ad9b-b83f8d8d76c1"},"Attributes":{"name":"code_monitors_trigger_jobs_worker","reason":""}} 10:11:36 worker | {"SeverityText":"INFO","Timestamp":1720174296425650191,"InstrumentationScope":"worker.batches-reconciler.routines","Caller":"workerutil/worker.go:220","Function":"github.com/sourcegraph/sourcegraph/internal/workerutil.(*Worker).Start","Body":"Shutting down dequeue loop","Resource":{"service.name":"worker","service.version":"3.40.0","service.instance.id":"940a0fc9-fa3e-4fcc-ad9b-b83f8d8d76c1"},"Attributes":{"name":"batches_reconciler_worker","reason":""}} 10:11:36 worker | {"SeverityText":"INFO","Timestamp":1720174296425702536,"InstrumentationScope":"actions","Caller":"workerutil/worker.go:220","Function":"github.com/sourcegraph/sourcegraph/internal/workerutil.(*Worker).Start","Body":"Shutting down dequeue loop","Resource":{"service.name":"worker","service.version":"3.40.0","service.instance.id":"940a0fc9-fa3e-4fcc-ad9b-b83f8d8d76c1"},"Attributes":{"name":"code_monitors_action_jobs_worker","reason":""}} 10:11:36 gitserver | t=2024-07-05T10:11:36+0000 lvl=eror msg="Failed to fetch remote info" repo=github.com/nixos/nixpkgs error="signal: interrupt" output= 10:11:36 gitserver | t=2024-07-05T10:11:36+0000 lvl=eror msg="Failed to ensure HEAD exists" repo=github.com/nixos/nixpkgs error="failed to fetch remote info: signal: interrupt" 10:11:36 gitserver | t=2024-07-05T10:11:36+0000 lvl=warn 
msg="Setting clone status in DB" error="setting clone status: failed to connect to `host=127.0.0.1 user=postgres database=sourcegraph`: dial error (dial tcp 127.0.0.1:5432: connect: connection refused)" 10:11:36 zoekt-webserver | 2024/07/05 10:11:36 shutting down 10:11:36 gitserver | t=2024-07-05T10:11:36+0000 lvl=warn msg="Setting last error in DB" error="setting last error: failed to connect to `host=127.0.0.1 user=postgres database=sourcegraph`: dial error (dial tcp 127.0.0.1:5432: connect: connection refused)" 10:11:36 gitserver | t=2024-07-05T10:11:36+0000 lvl=warn msg="error cloning repo" repo=depot err="failed to clone depot: clone failed. Output: : signal: interrupt" 10:11:36 gitserver | {"SeverityText":"WARN","Timestamp":1720174296433622096,"InstrumentationScope":"http","Caller":"trace/httptrace.go:261","Function":"github.com/sourcegraph/sourcegraph/internal/trace.HTTPMiddleware.func1","Body":"slow http request","Resource":{"service.name":"gitserver","service.version":"3.40.0","service.instance.id":"127.0.0.1:3178"},"Attributes":{"route_name":"unknown","method":"POST","url":"/repo-update","code":200,"duration":22.576419607}} 10:11:36 postgres_exporter | Terminating postgres_exporter 10:11:36 repo-updater | t=2024-07-05T10:11:36+0000 lvl=eror msg="runUpdateLoop: error updating repo" uri=depot err="failed to clone depot: clone failed. 
Output: : signal: interrupt" 10:11:36 frontend | t=2024-07-05T10:11:36+0000 lvl=eror msg="config: failed to read config from database(2), trying again in 1s (read error)" error="ConfStore.SiteGetLatest: unexpected EOF" 10:11:36 precise-code-intel-worker | {"SeverityText":"INFO","Timestamp":1720174296448050274,"InstrumentationScope":"worker","Caller":"workerutil/worker.go:220","Function":"github.com/sourcegraph/sourcegraph/internal/workerutil.(*Worker).Start","Body":"Shutting down dequeue loop","Resource":{"service.name":"precise-code-intel-worker","service.version":"3.40.0","service.instance.id":"0d337c8349c6"},"Attributes":{"name":"precise_code_intel_upload_worker","reason":""}} 10:11:36 worker | t=2024-07-05T10:11:36+0000 lvl=eror msg="Repository authz config was invalid (errors are visible in the UI as an admin user, you should fix ASAP). Restricting access to repositories by default for now to be safe." seriousProblems="[Could not list external services: unexpected EOF]" 10:11:36 prometheus | Terminating prometheus 10:11:36 github-proxy | Terminating github-proxy 10:11:36 zoekt-indexserver | Terminating zoekt-indexserver 10:11:36 symbols | Terminating symbols 10:11:36 repo-updater | Terminating repo-updater 10:11:36 grafana | Terminating grafana 10:11:36 precise-code-intel-worker | Terminating precise-code-intel-worker 10:11:36 zoekt-webserver | Terminating zoekt-webserver 10:11:36 worker | Terminating worker 10:11:36 frontend | Terminating frontend 10:11:36 jaeger | Terminating jaeger 10:11:36 syntect_server | Terminating syntect_server 10:11:36 searcher | Terminating searcher 10:11:36 nginx | Terminating nginx 10:11:36 minio | Terminating minio 10:11:44 gitserver | t=2024-07-05T10:11:44+0000 lvl=eror msg="retrying HTTP request failed" attempt=20 method=POST url=http://127.0.0.1:3090/.internal/configuration status=0 err="dial tcp 127.0.0.1:3090: connect: connection refused" 10:11:46 gitserver | Terminating gitserver
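Most of the dump above is follow-on noise. The entries that matter are the ones reporting "No space left on device", starting with the postgres PANIC. A minimal Python sketch for pulling those out of a flattened dump like this one follows. It is not part of the original report. It assumes every entry begins with `HH:MM:SS <service> | `, as the entries here do, and the function names are hypothetical.

```python
import re

# Assumption: each entry starts with "HH:MM:SS <service> | ", e.g.
# "10:11:36 postgres | ...". The lookahead splits *before* that prefix
# so the prefix stays attached to its entry.
ENTRY_START = re.compile(r"(?=\b\d{2}:\d{2}:\d{2} [\w-]+ \| )")


def split_entries(raw: str) -> list[str]:
    """Split a flattened log dump into one string per entry."""
    return [part.strip() for part in ENTRY_START.split(raw) if part.strip()]


def disk_full_entries(raw: str) -> list[str]:
    """Return only the entries that report an out-of-space error."""
    return [e for e in split_entries(raw) if "No space left on device" in e]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python triage.py < sourcegraph.log
    for entry in disk_full_entries(sys.stdin.read()):
        print(entry)
```

The inner postgres timestamps (such as `10:11:36.267`) are followed by a `.` rather than a space and a service name, so they should not be mistaken for the start of a new entry.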
aspen at 2024-07-06T13·24+00
that was bad. hope that doesn't happen again I guess.
aspen at 2024-07-06T13·26+00
- aspen closed this issue at 2024-07-06T13·26+00