Commit 5f2a2481 authored by Matteo Melli

Merge branch 'avoid-system-exit' into 'master'

Improvements

See merge request ongresinc/pgio!4
parents 3d74a2fb c9ff1c09
# pgio
A Java CLI, turned into an executable using GraalVM, that captures disk IO
usage stats per process (that can be grouped by process type) and total of the system.
The tool produce a stream of data that can be
stored and interpreted (using CSV by default or exporting to Prometheus) to
analyze and find out which part of PostgreSQL is using the disk most during a
period of time.
usage stats per process group (default groups are based on a common PostgreSQL
installation) and the total for the system.
The tool produces a stream of data that can be stored and interpreted (using
CSV by default or exporting to Prometheus) to analyze and find out which part
of PostgreSQL is using the disk most during a period of time.
## Stats collected
pgio runs only on modern Linux kernels (>= 2.6.x) reading `/proc/vmstat` and
`/proc/<pid>/(cmdline|stat|io)`
(http://man7.org/linux/man-pages/man5/proc.5.html)
(see [here](http://man7.org/linux/man-pages/man5/proc.5.html))
Taking iotop as a reference (see https://unix.stackexchange.com/questions/248197/iotop-showing-1-5-mb-s-of-disk-write-but-all-programs-have-0-00-b-s/248218#248218):
Taking iotop as a reference (see [here](https://unix.stackexchange.com/questions/248197/iotop-showing-1-5-mb-s-of-disk-write-but-all-programs-have-0-00-b-s/248218#248218)):
iotop reads information per process from `/proc/<pid>/io`, in particular:
@@ -43,11 +43,10 @@ pgpgout – Number of kilobytes the system has paged out to disk per second.
Combining per-process info with global info, iotop calculates the % of write/read
throughput each process is consuming.
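As a rough illustration of that combination, here is a minimal Java sketch. It is not taken from pgio or iotop: the class `ProcIoSketch` and its methods are hypothetical, and the percentage formula (the per-process delta divided by a system-wide delta over the same interval) is an assumption about how such a share could be derived. The system-wide total is passed in as bytes; since `/proc/vmstat` reports `pgpgin`/`pgpgout` in kilobytes, a caller would have to convert first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; it is not part of pgio or iotop.
class ProcIoSketch {

  // Parses the "name: value" lines of /proc/<pid>/io into an ordered map.
  static Map<String, Long> parse(String procIoText) {
    Map<String, Long> counters = new LinkedHashMap<>();
    for (String line : procIoText.split("\n")) {
      int colon = line.indexOf(':');
      if (colon < 0) {
        continue; // skip blank or unexpected lines
      }
      counters.put(line.substring(0, colon).trim(),
          Long.parseLong(line.substring(colon + 1).trim()));
    }
    return counters;
  }

  // Percentage of totalDelta contributed by one process between two samples.
  // Assumption: both values are byte counts measured over the same interval.
  static double percentOf(long before, long after, long totalDelta) {
    if (totalDelta <= 0) {
      return 0.0; // nothing was transferred system-wide, avoid dividing by zero
    }
    return 100.0 * (after - before) / totalDelta;
  }

  public static void main(String[] args) {
    // Two made-up samples of the same process, taken a few seconds apart.
    Map<String, Long> first = parse("rchar: 100\nwchar: 200\nread_bytes: 0\nwrite_bytes: 4096\n");
    Map<String, Long> second = parse("rchar: 100\nwchar: 9000\nread_bytes: 0\nwrite_bytes: 12288\n");
    long systemWriteDelta = 16384; // assumed system-wide bytes written in the interval
    System.out.println(
        percentOf(first.get("write_bytes"), second.get("write_bytes"), systemWriteDelta));
  }
}
```

With the made-up samples in `main`, the process wrote 8192 of 16384 bytes, so the sketch prints `50.0`.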
Referenced iotop is a python rewrite of original iotop
(https://github.com/analogue/iotop).
Referenced iotop is a python rewrite of original [iotop](https://github.com/analogue/iotop).
Original iotop code (https://github.com/Tomas-M/iotop) uses Taskstats
(https://www.kernel.org/doc/Documentation/accounting/taskstats.txt).
Original [iotop code](https://github.com/Tomas-M/iotop) uses
[Taskstats](https://www.kernel.org/doc/Documentation/accounting/taskstats.txt).
It seems that the stats obtained through taskstats calls include the following stats from
`/proc/<pid>/io`:
@@ -89,17 +88,18 @@ To build the project as a tar.gz with all Java dependencies:
mvn clean package -P assembler
```
To build the project as a tar.gz with an executable built using GraalVM
To build the project as a tar.gz with an executable built using [GraalVM](https://www.graalvm.org/)
`native-image` tool:
```
export GRAALVM_HOME=<path to graalvm home>
mvn clean package -P executable
```
## Run
This will generate output with collected stats of all processes (should be run
as root) every 3 seconds:
This will generate output with the collected stats of grouped processes (should be run
as a user with sufficient privileges, such as postgres) every 3 seconds:
```
bin/pgio -D <postgresql data dir>
@@ -134,34 +134,35 @@ Option Description
### Group configuration file
Specifing this file will enable a special mode where instead of single processes
groups with aggregated data will be printed.
Specifying this file lets you change the default group configuration.
A JSON file defining groups has the following syntax (regular expressions use the
pattern syntax of [java.util.regex.Pattern](https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html)):
```
{
"<group name>" [ <list of regexp> ],
[
{ "<group name>": [ <list of regexp> ] }
...
}
]
```
For example to use with PostgreSQL we create a template using following `postgresql.json`:
The default group configuration is:
```
{
"archiver": [ ".*archiver.*" ],
"wal sender": [ ".*wal sender.*" ],
"bgwriter": [ ".*writer process.*" ],
"autovacuum": [ ".*autovacuum worker process.*" ],
"stats": [ ".*stats collector process.*" ],
"wal writer": [ ".*wal writer process.*" ],
"checkpoint": [ ".*checkpointer process.*" ],
"query": [ "postgres: " ],
}
[
{ "archiver": [ ".*archiver.*" ] },
{ "wal sender": [ ".*wal sender.*" ] },
{ "bgwriter": [ ".*writer process.*" ] },
{ "autovacuum": [ ".*autovacuum worker process.*" ] },
{ "stats": [ ".*stats collector process.*" ] },
{ "wal writer": [ ".*wal writer process.*" ] },
{ "checkpoint": [ ".*checkpointer process.*" ] },
{ "query": [ "postgres: " ] }
]
```
If the configuration is saved in the file `postgresql.json`, it can be used as follows:
```
bin/pgio -D <postgresql data dir> --advanced --group postgresql.json
```
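To get a feel for how the default patterns behave, the following Java sketch checks a process command line against each group in order. It is only an approximation: the class `GroupMatchSketch` is hypothetical, and both the use of `Matcher.find()` (rather than a full match) and the choice to report every matching group are assumptions, since how pgio resolves a process that matches several groups is not described here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not pgio's matching code.
class GroupMatchSketch {

  // Group name -> compiled patterns, kept in the order of the default configuration.
  static final Map<String, List<Pattern>> GROUPS = new LinkedHashMap<>();

  static {
    add("archiver", ".*archiver.*");
    add("wal sender", ".*wal sender.*");
    add("bgwriter", ".*writer process.*");
    add("autovacuum", ".*autovacuum worker process.*");
    add("stats", ".*stats collector process.*");
    add("wal writer", ".*wal writer process.*");
    add("checkpoint", ".*checkpointer process.*");
    add("query", "postgres: ");
  }

  static void add(String name, String... regexps) {
    List<Pattern> patterns = new ArrayList<>();
    for (String regexp : regexps) {
      patterns.add(Pattern.compile(regexp));
    }
    GROUPS.put(name, patterns);
  }

  // Returns the names of all groups with at least one pattern found in cmdline.
  // Assumption: find() semantics, so "postgres: " matches anywhere in the string.
  static List<String> matchingGroups(String cmdline) {
    List<String> matches = new ArrayList<>();
    for (Map.Entry<String, List<Pattern>> group : GROUPS.entrySet()) {
      for (Pattern pattern : group.getValue()) {
        if (pattern.matcher(cmdline).find()) {
          matches.add(group.getKey());
          break; // one matching pattern is enough for this group
        }
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    System.out.println(matchingGroups("postgres: wal writer process"));
    System.out.println(matchingGroups("postgres: checkpointer process"));
  }
}
```

Note that `.*writer process.*` also matches a `wal writer process` command line, so pattern order and overlap matter when extending the groups.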
@@ -179,8 +180,6 @@ The expected output will include following groups:
The idea is that you can extend those groups to include other relevant
tools like wal-e, pgbouncer, or a particular user's queries.
Grouping stats by process type reduce the number of stats collected
and provide more understandable metrics.
## Prometheus service
@@ -189,3 +188,58 @@ To start pgio as a prometheus service:
```
bin/pgio -D <postgresql data dir> --advanced --prometheus-service --prometheus-bind 0.0.0.0 --prometheus-port 9090
```
### CSV output example
```
$ pgio
timestamp,pid,ppid,label,rchar,wchar,read_bytes,write_bytes,cancelled_write_bytes
2018-12-19T14:42:11.070Z,"archiver",,,0,0,0,0,0
2018-12-19T14:42:11.070Z,"wal sender",,,0,0,0,0,0
2018-12-19T14:42:11.070Z,"bgwriter",,,0,0,0,0,0
2018-12-19T14:42:11.070Z,"autovacuum",,,0,0,0,0,0
2018-12-19T14:42:11.070Z,"stats",,,0,0,0,0,0
2018-12-19T14:42:11.070Z,"wal writer",,,0,0,0,0,0
2018-12-19T14:42:11.069Z,"checkpoint",,,0,0,0,0,0
2018-12-19T14:42:11.070Z,"other",,,0,0,0,0,0
2018-12-19T14:42:14.017Z,"archiver",,,0,0,0,0,0
2018-12-19T14:42:14.017Z,"wal sender",,,0,0,0,0,0
2018-12-19T14:42:14.016Z,"bgwriter",,,1,50061313,73728,50061312,0
2018-12-19T14:42:14.017Z,"autovacuum",,,0,0,0,0,0
2018-12-19T14:42:14.017Z,"stats",,,0,0,0,0,0
2018-12-19T14:42:14.016Z,"wal writer",,,0,50061312,73728,50061312,0
2018-12-19T14:42:14.016Z,"checkpoint",,,0,0,0,0,0
2018-12-19T14:42:14.017Z,"other",,,42409984,342833920,30908416,249819136,0
2018-12-19T14:42:17.017Z,"archiver",,,0,0,0,0,0
2018-12-19T14:42:17.017Z,"wal sender",,,0,0,0,0,0
2018-12-19T14:42:17.016Z,"bgwriter",,,0,75595776,135168,75595776,0
2018-12-19T14:42:17.017Z,"autovacuum",,,0,0,0,0,0
2018-12-19T14:42:17.016Z,"stats",,,0,0,0,0,0
2018-12-19T14:42:17.016Z,"wal writer",,,0,75595776,135168,75595776,0
2018-12-19T14:42:17.016Z,"checkpoint",,,0,0,0,0,0
2018-12-19T14:42:17.016Z,"other",,,44113920,359063552,45285376,244834304,0
2018-12-19T14:42:20.019Z,"archiver",,,0,0,0,0,0
2018-12-19T14:42:20.019Z,"wal sender",,,0,0,0,0,0
2018-12-19T14:42:20.019Z,"bgwriter",,,0,98852864,0,98852864,0
2018-12-19T14:42:20.020Z,"autovacuum",,,0,0,0,0,0
2018-12-19T14:42:20.019Z,"stats",,,0,0,0,0,0
2018-12-19T14:42:20.019Z,"wal writer",,,0,98852864,0,98852864,0
2018-12-19T14:42:20.018Z,"checkpoint",,,0,221370,299008,225280,0
2018-12-19T14:42:20.019Z,"other",,,44843008,335036416,45166592,235864064,0
2018-12-19T14:42:23.019Z,"archiver",,,0,0,0,0,0
2018-12-19T14:42:23.019Z,"wal sender",,,0,0,0,0,0
2018-12-19T14:42:23.018Z,"bgwriter",,,1,9003009,0,9003008,0
2018-12-19T14:42:23.020Z,"autovacuum",,,0,0,0,0,0
2018-12-19T14:42:23.019Z,"stats",,,0,0,4096,0,0
2018-12-19T14:42:23.018Z,"wal writer",,,1,9003009,0,9003008,0
2018-12-19T14:42:23.018Z,"checkpoint",,,1,16670992,409600,16678912,0
2018-12-19T14:42:23.019Z,"other",,,8633088,75464765,8630272,58880000,0
2018-12-19T14:42:26.023Z,"archiver",,,0,0,0,0,0
2018-12-19T14:42:26.023Z,"wal sender",,,0,0,0,0,0
2018-12-19T14:42:26.022Z,"bgwriter",,,0,0,0,0,0
2018-12-19T14:42:26.024Z,"autovacuum",,,0,0,0,0,0
2018-12-19T14:42:26.022Z,"stats",,,0,0,0,0,0
2018-12-19T14:42:26.022Z,"wal writer",,,0,0,0,0,0
2018-12-19T14:42:26.022Z,"checkpoint",,,0,0,0,0,0
2018-12-19T14:42:26.022Z,"other",,,0,0,0,0,0
```
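The stream above can be post-processed with very little code. The following Java sketch sums `write_bytes` per label. It is a hypothetical helper (`CsvTotalsSketch` is not part of pgio) and rests on a few assumptions read off the sample rows: the quoted label is the second field, `write_bytes` is the eighth, labels contain no commas, and each row reports the bytes for its own interval.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; it is not part of pgio.
class CsvTotalsSketch {

  // Sums the write_bytes column per label over all data rows of the CSV text.
  static Map<String, Long> writeBytesPerLabel(String csv) {
    Map<String, Long> totals = new LinkedHashMap<>();
    for (String line : csv.split("\n")) {
      String[] fields = line.split(",", -1); // -1 keeps the empty fields
      if (fields.length != 9 || fields[0].equals("timestamp")) {
        continue; // skip the header and anything that is not a data row
      }
      String label = fields[1].replace("\"", ""); // strip the surrounding quotes
      long writeBytes = Long.parseLong(fields[7]);
      totals.merge(label, writeBytes, Long::sum);
    }
    return totals;
  }

  public static void main(String[] args) {
    String csv =
        "timestamp,pid,ppid,label,rchar,wchar,read_bytes,write_bytes,cancelled_write_bytes\n"
        + "2018-12-19T14:42:14.016Z,\"bgwriter\",,,1,50061313,73728,50061312,0\n"
        + "2018-12-19T14:42:17.016Z,\"bgwriter\",,,0,75595776,135168,75595776,0\n";
    System.out.println(writeBytesPerLabel(csv));
  }
}
```

A naive `split(",")` is enough for this sample; a real CSV parser would be the safer choice if labels could ever contain commas or escaped quotes.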
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.ongres.pgio</groupId>
<artifactId>build-resources</artifactId>
<version>1.0.0</version>
<packaging>jar</packaging>
<groupId>com.ongres.pgio</groupId>
<artifactId>build-resources</artifactId>
<version>1.0.0</version>
<packaging>jar</packaging>
<name>pgio: Build Resources</name>
<name>pgio: Build Resources</name>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<checkstyle.skipExec>true</checkstyle.skipExec>
<license.skip>true</license.skip>
</properties>
<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<checkstyle.skipExec>true</checkstyle.skipExec>
<license.skip>true</license.skip>
</properties>
<profiles>
<profile>
<id>deploy</id> <!-- It must be manually turn on when mvn deploy is executed -->
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>3.0.1</version>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar-no-fork</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>2.10.4</version>
<configuration>
<failOnError>false</failOnError>
</configuration>
<executions>
<execution>
<id>attach-javadocs</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
</profiles>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.6.1</version>
<configuration>
<source>${maven.compiler.source}</source>
<target>${maven.compiler.target}</target>
<compilerArgument>-Xlint:all</compilerArgument>
<showDeprecation>true</showDeprecation>
<showWarnings>true</showWarnings>
</configuration>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<id>deploy</id> <!-- It must be manually turned on when mvn deploy is executed -->
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>3.0.1</version>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar-no-fork</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>2.10.4</version>
<configuration>
<failOnError>false</failOnError>
</configuration>
<executions>
<execution>
<id>attach-javadocs</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
</profiles>
</project>
@@ -29,7 +29,10 @@ import com.ongres.pgio.main.stats.StatSnapshot;
import com.ongres.pgio.main.stats.serializer.CsvSerializer;
import com.ongres.pgio.main.stats.serializer.StatSerializer;
import com.ongres.pgio.main.version.Version;
import fi.iki.elonen.NanoHTTPD;
import joptsimple.OptionException;
import joptsimple.OptionParser;
import joptsimple.OptionSet;
@@ -49,23 +52,42 @@ import java.util.function.Consumer;
import java.util.function.Function;
import javax.json.Json;
import javax.json.JsonObject;
import javax.json.JsonString;
import javax.json.JsonValue;
import javax.json.JsonValue.ValueType;
public class Main {
public static void main(String[] args) throws Exception {
System.exit(run(args));
}
private static int run(String[] args) throws Exception {
OptionParser parser;
OptionSet options;
try {
parser = createOptionParser();
options = parser.parse(args);
} catch (OptionException ex) {
System.err.println(messageOrType(ex));
return -1;
} catch (RuntimeException ex) {
onException(ex, true);
return -1;
} catch (Exception ex) {
onException(ex, true);
return -1;
}
try {
OptionParser parser = createOptionParser();
OptionSet options = parser.parse(args);
if (options.has("help")) {
parser.printHelpOn(System.out);
System.exit(0);
return 0;
}
if (options.has("version")) {
System.out.println(Version.getVersion());
System.exit(0);
return 0;
}
Config config = configOptionSet(options);
@@ -75,21 +97,47 @@ public class Main {
} else {
runCollector(config);
}
} catch (Throwable throwable) {
if (throwable.getMessage() != null) {
System.err.println(throwable.getMessage());
} else {
System.err.println("Seems like you hit a bug. Please open an issue with"
+ " the following stack trace at https://gitlab.com/teoincontatto/pgio/issues/new");
System.err.println();
throwable.printStackTrace(System.err);
System.err.println();
}
System.err.println("Try \"pgio --help\" for more information.");
System.exit(1);
} catch (RuntimeException ex) {
return onException(ex, options.has("debug"));
} catch (Exception ex) {
return onException(ex, options.has("debug"));
}
return 0;
}
private static int onException(Exception ex, boolean debug) {
System.err.println(messageOrType(ex));
if (debug) {
onExceptionHint(ex);
}
return -1;
}
private static void onExceptionHint(Exception ex) {
System.err.println();
System.err.println();
System.err.println("If you think you hit a bug, please open an issue attaching"
+ " the following stack trace at https://gitlab.com/ongresinc/pgio/issues/new");
System.err.println();
ex.printStackTrace(System.err);
System.err.println();
System.err.println("Try \"pgio --help\" for more information.");
System.err.println();
}
private static String messageOrType(Exception ex) {
String message = ex.getMessage();
if (message == null) {
return ex.getClass().getName();
}
return message;
}
private static void runPrometheusService(Config config) throws IOException {
PrometheusService service = new PrometheusService(config);
service.start(NanoHTTPD.SOCKET_READ_TIMEOUT, false);
@@ -193,17 +241,33 @@ public class Main {
File groupConfigFile = new File(config);
try (InputStream inputStream = new FileInputStream(groupConfigFile)) {
ImmutableList.Builder<ProcessGroupInfo> builder = ImmutableList.builder();
Json.createReader(inputStream).readObject().forEach((key, value) -> {
ProcessGroupInfo.Builder groupBuilder = new ProcessGroupInfo.Builder(key);
value.asJsonArray()
.forEach(entry -> groupBuilder.addPattern(((JsonString) entry).getString()));
builder.add(groupBuilder.build());
});
JsonValue groups = Json.createReader(inputStream).readValue();
if (groups.getValueType() == ValueType.ARRAY) {
// Reuse the value parsed above: readValue() has already consumed the stream.
groups.asJsonArray().forEach(element -> {
if (element.getValueType() == ValueType.OBJECT) {
appendFromObject(builder, element.asJsonObject());
}
});
} else if (groups.getValueType() == ValueType.OBJECT) {
appendFromObject(builder, groups.asJsonObject());
}
return builder.build();
} catch (IOException ex) {
throw new RuntimeException(ex);
}
}
private static void appendFromObject(ImmutableList.Builder<ProcessGroupInfo> builder,
JsonObject groups) {
groups.asJsonObject().forEach((key, value) -> {
ProcessGroupInfo.Builder groupBuilder = new ProcessGroupInfo.Builder(key);
value.asJsonArray()
.forEach(entry -> groupBuilder.addPattern(((JsonString) entry).getString()));
builder.add(groupBuilder.build());
});
}
private static InetAddress readInetAddress(String address) {
try {
......