
Commit 141e946

Yorick Peterse committed
Storing of application metrics in InfluxDB
This adds the ability to write application metrics (e.g. SQL timings) to InfluxDB. These metrics can in turn be visualized using Grafana, or really anything else that can read from InfluxDB. These metrics can be used to track application performance over time, between different Ruby versions, different GitLab versions, etc.

== Transaction Metrics

Currently the following is tracked on a per transaction basis (a transaction is a Rails request or a single Sidekiq job):

* Timings per query along with the raw (obfuscated) SQL and information about what file the query originated from.
* Timings per view along with the path of the view and information about what file triggered the rendering process.
* The duration of a request itself along with the controller/worker class and method name.
* The duration of any instrumented method calls (more below).

== Sampled Metrics

Certain metrics can't be directly associated with a transaction. For example, a process' total memory usage is unrelated to any running transactions. While a transaction can result in the memory usage going up, there's no accurate way to determine what transaction is to blame; this becomes especially problematic in multi-threaded environments.

To solve this problem there's a separate thread that takes samples at a fixed interval. This thread (using the class Gitlab::Metrics::Sampler) currently tracks the following:

* The process' total memory usage.
* The number of file descriptors opened by the process.
* The number of Ruby objects (using ObjectSpace.count_objects).
* GC statistics such as timings, heap slots, etc.

The default/current interval is 15 seconds; any smaller interval might put too much pressure on InfluxDB (especially when running dozens of processes).

== Method Instrumentation

While not yet used, methods can be instrumented to track how long they take to run. Unlike the likes of New Relic this doesn't require modifying the source code (e.g. including modules); it all happens from the outside. For example, to track `User.by_login` we'd add the following code somewhere in an initializer:

    Gitlab::Metrics::Instrumentation.instrument_method(User, :by_login)

To instrument an instance method instead:

    Gitlab::Metrics::Instrumentation.instrument_instance_method(User, :save)

Instrumentation for either all public model methods or a few crucial ones will be added in the near future; I simply haven't gotten to doing so just yet.

== Configuration

By default metrics are disabled. This means users don't have to bother setting anything up if they don't want to. Metrics can be enabled by editing one's gitlab.yml configuration file (see config/gitlab.yml.example for example settings).

== Writing Data To InfluxDB

Because InfluxDB is still a fairly young product I expect the worst: data loss, unexpected reboots, the database not responding, you name it. Because of this, data is _not_ written to InfluxDB directly; instead it's queued and processed by Sidekiq. This ensures that users won't notice anything when InfluxDB is giving trouble.

The metrics worker can be started in a standalone manner as follows:

    bundle exec sidekiq -q metrics

The corresponding class is called MetricsWorker.
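
The sampling thread described under "Sampled Metrics" could look roughly like the sketch below. This is not the Gitlab::Metrics::Sampler class from this commit (its source is not part of the files shown here). `SketchSampler`, its `sink` queue and the sample field names are made up for illustration, and only object counts and GC runs are collected so that the example needs nothing beyond plain Ruby.

```ruby
# Hypothetical stand-in for Gitlab::Metrics::Sampler; all names are illustrative.
class SketchSampler
  attr_reader :sink

  # interval - Seconds to wait between samples (the commit message uses 15).
  # sink     - A thread-safe queue that receives every sample hash.
  def initialize(interval = 15, sink = Queue.new)
    @interval = interval
    @sink = sink
    @thread = nil
  end

  # Takes a single sample of process-wide statistics.
  def sample
    {
      time: Time.now.to_f,
      objects: ObjectSpace.count_objects[:TOTAL],
      gc_runs: GC.count
    }
  end

  # Starts the background thread; it keeps sampling until #stop is called.
  def start
    @thread = Thread.new do
      loop do
        @sink << sample
        sleep(@interval)
      end
    end
  end

  # Stops the background thread if it is running.
  def stop
    @thread.kill if @thread
    @thread = nil
  end
end

sampler = SketchSampler.new(15)
first = sampler.sample
```

Calling `sampler.start` would push one hash into `sampler.sink` per interval. A consumer would drain that queue and hand the points to whatever writes to InfluxDB, which keeps the sampling thread independent of any single transaction.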
1 parent b2c593d commit 141e946

32 files changed: +1368 -13 lines changed

Gemfile

+6
@@ -208,6 +208,12 @@ gem 'select2-rails', '~> 3.5.9'
 gem 'virtus', '~> 1.0.1'
 gem 'net-ssh', '~> 3.0.1'

+# Metrics
+group :metrics do
+  gem 'influxdb', '~> 0.2', require: false
+  gem 'connection_pool', '~> 2.0', require: false
+end
+
 group :development do
   gem "foreman"
   gem 'brakeman', '3.0.1', require: false

Gemfile.lock

+18-12
@@ -65,7 +65,7 @@ GEM
     attr_encrypted (1.3.4)
       encryptor (>= 1.3.0)
     attr_required (1.0.0)
-    autoprefixer-rails (6.1.1)
+    autoprefixer-rails (6.1.2)
       execjs
       json
     awesome_print (1.2.0)
@@ -102,7 +102,7 @@ GEM
     bundler-audit (0.4.0)
       bundler (~> 1.2)
       thor (~> 0.18)
-    byebug (8.2.0)
+    byebug (8.2.1)
     cal-heatmap-rails (0.0.1)
     capybara (2.4.4)
       mime-types (>= 1.16)
@@ -117,6 +117,7 @@ GEM
       activemodel (>= 3.2.0)
       activesupport (>= 3.2.0)
       json (>= 1.7)
+    cause (0.1)
     charlock_holmes (0.7.3)
     chunky_png (1.3.5)
     cliver (0.3.2)
@@ -140,10 +141,10 @@ GEM
       term-ansicolor (~> 1.3)
       thor (~> 0.19.1)
       tins (~> 1.6.0)
-    crack (0.4.2)
+    crack (0.4.3)
       safe_yaml (~> 1.0.0)
     creole (0.5.0)
-    d3_rails (3.5.6)
+    d3_rails (3.5.11)
       railties (>= 3.1.0)
     daemons (1.2.3)
     database_cleaner (1.4.1)
@@ -230,7 +231,7 @@ GEM
       ipaddress (~> 0.5)
       nokogiri (~> 1.5, >= 1.5.11)
       opennebula
-    fog-brightbox (0.9.0)
+    fog-brightbox (0.10.1)
       fog-core (~> 1.22)
       fog-json
       inflecto (~> 0.0.2)
@@ -249,7 +250,7 @@ GEM
       fog-core (>= 1.21.0)
       fog-json
       fog-xml (>= 0.0.1)
-    fog-sakuracloud (1.4.0)
+    fog-sakuracloud (1.5.0)
       fog-core
       fog-json
     fog-softlayer (1.0.2)
@@ -277,11 +278,11 @@ GEM
       ruby-progressbar (~> 1.4)
     gemnasium-gitlab-service (0.2.6)
       rugged (~> 0.21)
-    gemojione (2.1.0)
+    gemojione (2.1.1)
       json
     get_process_mem (0.2.0)
     gherkin-ruby (0.3.2)
-    github-linguist (4.7.2)
+    github-linguist (4.7.3)
       charlock_holmes (~> 0.7.3)
       escape_utils (~> 1.1.0)
       mime-types (>= 1.19)
@@ -298,7 +299,7 @@ GEM
       posix-spawn (~> 0.3)
     gitlab_emoji (0.2.0)
       gemojione (~> 2.1)
-    gitlab_git (7.2.21)
+    gitlab_git (7.2.22)
       activesupport (~> 4.0)
       charlock_holmes (~> 0.7.3)
       github-linguist (~> 4.7.0)
@@ -370,6 +371,9 @@ GEM
     i18n (0.7.0)
     ice_nine (0.11.1)
     inflecto (0.0.2)
+    influxdb (0.2.3)
+      cause
+      json
     ipaddress (0.8.0)
     jquery-atwho-rails (1.3.2)
     jquery-rails (3.1.4)
@@ -416,11 +420,11 @@ GEM
     net-ldap (0.12.1)
     net-ssh (3.0.1)
     netrc (0.11.0)
-    newrelic-grape (2.0.0)
+    newrelic-grape (2.1.0)
       grape
       newrelic_rpm
     newrelic_rpm (3.9.4.245)
-    nokogiri (1.6.7)
+    nokogiri (1.6.7.1)
       mini_portile2 (~> 2.0.0.rc2)
     nprogress-rails (0.1.6.7)
     oauth (0.4.7)
@@ -783,7 +787,7 @@ GEM
       coercible (~> 1.0)
       descendants_tracker (~> 0.0, >= 0.0.3)
       equalizer (~> 0.0, >= 0.0.9)
-    warden (1.2.3)
+    warden (1.2.4)
       rack (>= 1.0)
     web-console (2.2.1)
       activemodel (>= 4.0)
@@ -836,6 +840,7 @@ DEPENDENCIES
   charlock_holmes (~> 0.7.3)
   coffee-rails (~> 4.1.0)
   colorize (~> 0.7.0)
+  connection_pool (~> 2.0)
   coveralls (~> 0.8.2)
   creole (~> 0.5.0)
   d3_rails (~> 3.5.5)
@@ -873,6 +878,7 @@ DEPENDENCIES
   hipchat (~> 1.5.0)
   html-pipeline (~> 1.11.0)
   httparty (~> 0.13.3)
+  influxdb (~> 0.2)
   jquery-atwho-rails (~> 1.3.2)
   jquery-rails (~> 3.1.3)
   jquery-scrollto-rails (~> 1.4.3)

Procfile

+1-1
@@ -3,5 +3,5 @@
 # lib/support/init.d, which call scripts in bin/ .
 #
 web: bundle exec unicorn_rails -p ${PORT:="3000"} -E ${RAILS_ENV:="development"} -c ${UNICORN_CONFIG:="config/unicorn.rb"}
-worker: bundle exec sidekiq -q post_receive -q mailers -q archive_repo -q system_hook -q project_web_hook -q gitlab_shell -q incoming_email -q runner -q common -q default
+worker: bundle exec sidekiq -q post_receive -q mailers -q archive_repo -q system_hook -q project_web_hook -q gitlab_shell -q incoming_email -q runner -q common -q default -q metrics
 # mail_room: bundle exec mail_room -q -c config/mail_room.yml

app/workers/metrics_worker.rb

+29
@@ -0,0 +1,29 @@
+class MetricsWorker
+  include Sidekiq::Worker
+
+  sidekiq_options queue: :metrics
+
+  def perform(metrics)
+    prepared = prepare_metrics(metrics)
+
+    Gitlab::Metrics.pool.with do |connection|
+      connection.write_points(prepared)
+    end
+  end
+
+  def prepare_metrics(metrics)
+    metrics.map do |hash|
+      new_hash = hash.symbolize_keys
+
+      new_hash[:tags].each do |key, value|
+        new_hash[:tags][key] = escape_value(value)
+      end
+
+      new_hash
+    end
+  end
+
+  def escape_value(value)
+    value.gsub('=', '\\=')
+  end
+end
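
To see what the preparation step does to a payload, here is a plain-Ruby sketch that runs without Sidekiq, ActiveSupport or InfluxDB. It is an approximation rather than the worker itself: the top-level functions, the sample series name and the tag values are assumptions, `symbolize_keys` is replaced by a hand-written loop, and the escaping uses the block form of `gsub` so that no backslash rules for replacement strings come into play.

```ruby
# Escapes "=" in a tag value. Illustrative equivalent of
# MetricsWorker#escape_value; the block form inserts the text literally.
def escape_value(value)
  value.to_s.gsub('=') { '\\=' }
end

# Symbolizes the top-level keys of every point and escapes all tag values.
def prepare_metrics(metrics)
  metrics.map do |hash|
    new_hash = hash.each_with_object({}) do |(key, value), out|
      out[key.to_sym] = value
    end

    tags = new_hash[:tags] || {}

    new_hash[:tags] = tags.each_with_object({}) do |(key, value), out|
      out[key] = escape_value(value)
    end

    new_hash
  end
end

# Sidekiq serializes job arguments as JSON, so keys arrive as Strings.
# The series and tag names below are made up for the example.
payload = [
  {
    'series' => 'example_series',
    'tags'   => { 'action' => 'UsersController#name=' },
    'values' => { 'duration' => 12.5 }
  }
]

prepared = prepare_metrics(payload)
```

After this step `prepared.first[:tags]['action']` holds `UsersController#name\=`, while the nested tag keys remain Strings because only the top level is symbolized, as in the worker above.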

config/gitlab.yml.example

+17
@@ -421,9 +421,22 @@ production: &base
       #
       # Ban an IP for one hour (3600s) after too many auth attempts
       # bantime: 3600
+  metrics:
+    enabled: false
+    # The name of the InfluxDB database to store metrics in.
+    database: gitlab
+    # Credentials to use for logging in to InfluxDB.
+    # username:
+    # password:
+    # The amount of InfluxDB connections to open.
+    # pool_size: 16
+    # The timeout of a connection in seconds.
+    # timeout: 10

 development:
   <<: *base
+  metrics:
+    enabled: false

 test:
   <<: *base
@@ -466,6 +479,10 @@ test:
         user_filter: ''
         group_base: 'ou=groups,dc=example,dc=com'
         admin_group: ''
+  metrics:
+    enabled: false

 staging:
   <<: *base
+  metrics:
+    enabled: false
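
One detail of the configuration above is worth checking by hand: YAML merge keys (`<<: *base`) merge shallowly, so a `metrics:` block in an environment replaces the inherited block as a whole instead of being combined with it. The sketch below demonstrates this with a trimmed-down, hypothetical settings file and Ruby's bundled YAML parser. Whether GitLab's settings loader fills in defaults such as `database` afterwards is not visible in this diff.

```ruby
require 'yaml'

# Trimmed-down, hypothetical version of the settings file above.
text = <<~CONFIG
  production: &base
    metrics:
      enabled: false
      database: gitlab

  development:
    <<: *base
    metrics:
      enabled: true
CONFIG

# Aliases have to be allowed explicitly when using safe_load.
config = YAML.safe_load(text, aliases: true)

production  = config['production']['metrics']
development = config['development']['metrics']

puts production.inspect
puts development.inspect
```

Here `development` ends up with only the `enabled` key, because its own `metrics` mapping overrides the one inherited from `base`. Someone enabling metrics in a single environment would therefore likely need to repeat `database` (and any credentials) in that environment's block, unless defaults are applied elsewhere.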

config/initializers/metrics.rb

+25
@@ -0,0 +1,25 @@
+if Gitlab::Metrics.enabled?
+  require 'influxdb'
+  require 'socket'
+  require 'connection_pool'
+
+  # These are manually require'd so the classes are registered properly with
+  # ActiveSupport.
+  require 'gitlab/metrics/subscribers/action_view'
+  require 'gitlab/metrics/subscribers/active_record'
+  require 'gitlab/metrics/subscribers/method_call'
+
+  Gitlab::Application.configure do |config|
+    config.middleware.use(Gitlab::Metrics::RackMiddleware)
+  end
+
+  Sidekiq.configure_server do |config|
+    config.server_middleware do |chain|
+      chain.add Gitlab::Metrics::SidekiqMiddleware
+    end
+  end
+
+  GC::Profiler.enable
+
+  Gitlab::Metrics::Sampler.new.start
+end

lib/gitlab/metrics.rb

+52
@@ -0,0 +1,52 @@
+module Gitlab
+  module Metrics
+    def self.pool_size
+      Settings.metrics['pool_size'] || 16
+    end
+
+    def self.timeout
+      Settings.metrics['timeout'] || 10
+    end
+
+    def self.enabled?
+      !!Settings.metrics['enabled']
+    end
+
+    def self.pool
+      @pool
+    end
+
+    def self.hostname
+      @hostname
+    end
+
+    def self.last_relative_application_frame
+      root = Rails.root.to_s
+      metrics = Rails.root.join('lib', 'gitlab', 'metrics').to_s
+
+      frame = caller_locations.find do |l|
+        l.path.start_with?(root) && !l.path.start_with?(metrics)
+      end
+
+      if frame
+        return frame.path.gsub(/^#{Rails.root.to_s}\/?/, ''), frame.lineno
+      else
+        return nil, nil
+      end
+    end
+
+    @hostname = Socket.gethostname
+
+    # When enabled this should be set before being used as the usual pattern
+    # "@foo ||= bar" is _not_ thread-safe.
+    if enabled?
+      @pool = ConnectionPool.new(size: pool_size, timeout: timeout) do
+        db = Settings.metrics['database']
+        user = Settings.metrics['username']
+        pw = Settings.metrics['password']
+
+        InfluxDB::Client.new(db, username: user, password: pw)
+      end
+    end
+  end
+end
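
The frame lookup in `last_relative_application_frame` can be tried outside Rails with stand-in data. The sketch below assumes a hypothetical install path (`/srv/gitlab`) and uses a small `Frame` struct in place of the `Thread::Backtrace::Location` objects returned by `caller_locations`. The function name and arguments are illustrative, and the root is passed through `Regexp.escape`, which the original does not do.

```ruby
# Hypothetical stand-in for Thread::Backtrace::Location.
Frame = Struct.new(:path, :lineno)

# Returns [relative_path, line] for the first frame that lives inside the
# application root but outside the metrics code, or [nil, nil] if none does.
def first_application_frame(frames, root, metrics_dir)
  frame = frames.find do |f|
    f.path.start_with?(root) && !f.path.start_with?(metrics_dir)
  end

  return [nil, nil] unless frame

  relative = frame.path.sub(/\A#{Regexp.escape(root)}\/?/, '')

  [relative, frame.lineno]
end

root = '/srv/gitlab'
metrics_dir = File.join(root, 'lib', 'gitlab', 'metrics')

# A made-up call stack, innermost frame first.
frames = [
  Frame.new('/usr/lib/ruby/gems/activerecord/relation.rb', 10),
  Frame.new('/srv/gitlab/lib/gitlab/metrics/subscribers/active_record.rb', 20),
  Frame.new('/srv/gitlab/app/models/user.rb', 42),
  Frame.new('/srv/gitlab/app/controllers/users_controller.rb', 7)
]

path, line = first_application_frame(frames, root, metrics_dir)
missing = first_application_frame(frames.first(1), root, metrics_dir)
```

With this stack, the gem frame and the metrics subscriber frame are skipped, so the lookup reports `app/models/user.rb` at line 42, which is the kind of "what file did this query originate from" information the commit message mentions.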

lib/gitlab/metrics/delta.rb

+32
@@ -0,0 +1,32 @@
+module Gitlab
+  module Metrics
+    # Class for calculating the difference between two numeric values.
+    #
+    # Every call to `compared_with` updates the internal value. This makes it
+    # possible to use a single Delta instance to calculate the delta over time
+    # of an ever increasing number.
+    #
+    # Example usage:
+    #
+    #     delta = Delta.new(0)
+    #
+    #     delta.compared_with(10) # => 10
+    #     delta.compared_with(15) # => 5
+    #     delta.compared_with(20) # => 5
+    class Delta
+      def initialize(value = 0)
+        @value = value
+      end
+
+      # new_value - The value to compare with as a Numeric.
+      #
+      # Returns a new Numeric (depending on the type of `new_value`).
+      def compared_with(new_value)
+        delta = new_value - @value
+        @value = new_value
+
+        delta
+      end
+    end
+  end
+end
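
A typical use for a class like this is turning an ever-increasing counter into per-interval increases. The sketch below copies the class outside its `Gitlab::Metrics` namespace so it runs on its own. The counter readings are invented, standing in for something like a cumulative GC run count read once per sampling interval.

```ruby
# Standalone copy of the Delta class above, for illustration only.
class Delta
  def initialize(value = 0)
    @value = value
  end

  # Returns the difference with the previous value and remembers new_value.
  def compared_with(new_value)
    delta = new_value - @value
    @value = new_value

    delta
  end
end

# Invented readings of a cumulative counter, one per sampling interval.
readings = [100, 130, 130, 175]

# Seeding with the first reading avoids reporting the whole history as the
# first increase.
counter = Delta.new(readings.first)

increases = readings.map { |reading| counter.compared_with(reading) }

puts increases.inspect
```

The result is one increase per reading (zero when the counter did not move), which is usually more useful on a graph than the raw cumulative total.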

lib/gitlab/metrics/instrumentation.rb

+47
@@ -0,0 +1,47 @@
+module Gitlab
+  module Metrics
+    # Module for instrumenting methods.
+    #
+    # This module allows instrumenting of methods without having to actually
+    # alter the target code (e.g. by including modules).
+    #
+    # Example usage:
+    #
+    #     Gitlab::Metrics::Instrumentation.instrument_method(User, :by_login)
+    module Instrumentation
+      # Instruments a class method.
+      #
+      # mod  - The module to instrument as a Module/Class.
+      # name - The name of the method to instrument.
+      def self.instrument_method(mod, name)
+        instrument(:class, mod, name)
+      end
+
+      # Instruments an instance method.
+      #
+      # mod  - The module to instrument as a Module/Class.
+      # name - The name of the method to instrument.
+      def self.instrument_instance_method(mod, name)
+        instrument(:instance, mod, name)
+      end
+
+      def self.instrument(type, mod, name)
+        return unless Metrics.enabled?
+
+        alias_name = "_original_#{name}"
+        target = type == :instance ? mod : mod.singleton_class
+
+        target.class_eval do
+          alias_method(alias_name, name)
+
+          define_method(name) do |*args, &block|
+            ActiveSupport::Notifications.
+              instrument("#{type}_method.method_call", module: mod, name: name) do
+                __send__(alias_name, *args, &block)
+              end
+          end
+        end
+      end
+    end
+  end
+end
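
The alias-and-redefine technique used by `instrument` can be tried without Rails. In the sketch below, `SketchInstrumentation`, its `CALLS` array and the `Greeter` class are all hypothetical. Instead of publishing an ActiveSupport::Notifications event, the wrapper times the call with a monotonic clock and appends a hash to `CALLS`, so it only approximates the module above.

```ruby
# Hypothetical, dependency-free variant of Gitlab::Metrics::Instrumentation.
module SketchInstrumentation
  # Collected call records; stands in for notification subscribers.
  CALLS = []

  # type - :class or :instance
  # mod  - The Module/Class that owns the method.
  # name - The method name as a Symbol.
  def self.instrument(type, mod, name)
    alias_name = "_original_#{name}"
    target = type == :instance ? mod : mod.singleton_class

    target.class_eval do
      # Keep the original implementation reachable under a new name.
      alias_method(alias_name, name)

      # Replace the method with a wrapper that times the original.
      define_method(name) do |*args, &block|
        start = Process.clock_gettime(Process::CLOCK_MONOTONIC)

        begin
          __send__(alias_name, *args, &block)
        ensure
          duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

          CALLS << { type: type, module: mod, name: name, duration: duration }
        end
      end
    end
  end
end

# Made-up class to instrument.
class Greeter
  def self.default
    new('world')
  end

  def initialize(name)
    @name = name
  end

  def greet
    "hello #{@name}"
  end
end

SketchInstrumentation.instrument(:class, Greeter, :default)
SketchInstrumentation.instrument(:instance, Greeter, :greet)

greeting = Greeter.default.greet
```

`Greeter` itself is never edited, which is the "from the outside" property the commit message describes. The return value of each call is unchanged, and one record per call lands in `SketchInstrumentation::CALLS`.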
