ganeti-github.git
5 years ago  Add docstring to certificate verification
Helga Velroyen [Fri, 17 Jul 2015 07:35:40 +0000 (09:35 +0200)]
Add docstring to certificate verification

This adds a bit of documentation to one of the
certificate verification methods to distinguish
them better. The need for this only became apparent
after the merge of 2.12 into 2.13.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  NEWS: move 2.13.0 beta/rc to their place
Klaus Aehlig [Thu, 16 Jul 2015 09:24:31 +0000 (11:24 +0200)]
NEWS: move 2.13.0 beta/rc to their place

Apparently during merges in the past, the NEWS entries
for 2.13.0 rc1 and 2.13.0 beta1 ended up between the entries
for 2.12.4 and 2.12.3. Move them to their rightful place now.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Merge branch 'stable-2.12' into stable-2.13
Klaus Aehlig [Thu, 16 Jul 2015 09:07:44 +0000 (11:07 +0200)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Bugfix in checkInstanceMove function in Cluster.hs
  Revision bump for 2.12.5
  Update the NEWS file for 2.12.5
  Update Xen documentation in install.rst
  Clarify need for the migration_port Xen param

Conflicts:
NEWS: take both new entries
configure.ac: keep version and revision of stable-2.13

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Bugfix in checkInstanceMove function in Cluster.hs
Oleg Ponomarev [Wed, 15 Jul 2015 17:46:14 +0000 (20:46 +0300)]
Bugfix in checkInstanceMove function in Cluster.hs

The checkInstanceMove function tries all possible moves of a single
instance in order to find an optimal move. When the option
--no-disk-moves is enabled, the current implementation tries only the
Failover move, although FailoverToAny is a suitable move too. This patch
fixes the bug.

Signed-off-by: Oleg Ponomarev <onponomarev@gmail.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
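
The fix can be illustrated with a minimal Python sketch. The real code
is Haskell in Cluster.hs; the function name candidate_moves and the
placeholder "SomeDiskMove" are hypothetical and only stand in for the
moves that require moving disks.

```python
def candidate_moves(disk_moves_allowed):
    """Return the moves to try for a single instance (illustrative only)."""
    # Both moves are tried even with --no-disk-moves; before the fix,
    # only "Failover" was considered in that case.
    moves = ["Failover", "FailoverToAny"]
    if disk_moves_allowed:
        # Hypothetical placeholder for the moves that involve disk moves.
        moves.append("SomeDiskMove")
    return moves
```

With disk moves disabled, the sketch still returns both failover
variants.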

5 years ago  Document data collector options
Klaus Aehlig [Mon, 13 Jul 2015 15:33:32 +0000 (17:33 +0200)]
Document data collector options

The options --enabled-data-collectors and --data-collector-interval were
added to gnt-cluster modify quite a while ago on stable-2.13. However,
they have never been documented in the man page. Do so now.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Revision bump for 2.12.5  (tag: v2.12.5)
Petr Pudlak [Mon, 13 Jul 2015 14:02:16 +0000 (16:02 +0200)]
Revision bump for 2.12.5

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

5 years ago  Update the NEWS file for 2.12.5
Petr Pudlak [Mon, 13 Jul 2015 14:00:57 +0000 (16:00 +0200)]
Update the NEWS file for 2.12.5

... mentioning all the changes.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

5 years ago  Correct NEWS file entry
Hrvoje Ribicic [Mon, 13 Jul 2015 13:22:49 +0000 (15:22 +0200)]
Correct NEWS file entry

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Update Xen documentation in install.rst
Hrvoje Ribicic [Mon, 13 Jul 2015 10:14:50 +0000 (10:14 +0000)]
Update Xen documentation in install.rst

The Xen documentation in install.rst was out of date, describing
xm-specific changes at the point where 2.12 is mostly used with xl.
This patch removes xm-specific migration steps, references the official
Xen wiki instead of replicating information from it, removes the
VNC setup settings that are outdated for xl and probably for xm, and
slightly rewrites the documentation.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Clarify need for the migration_port Xen param
Hrvoje Ribicic [Mon, 13 Jul 2015 10:14:17 +0000 (10:14 +0000)]
Clarify need for the migration_port Xen param

... depending on which toolstack is used.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Revision bump for 2.13.2  (tag: v2.13.2)
Hrvoje Ribicic [Mon, 13 Jul 2015 11:30:41 +0000 (11:30 +0000)]
Revision bump for 2.13.2

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Update the NEWS file for 2.13.2
Hrvoje Ribicic [Mon, 13 Jul 2015 11:30:10 +0000 (11:30 +0000)]
Update the NEWS file for 2.13.2

... mentioning all the changes.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Merge branch 'stable-2.12' into stable-2.13
Klaus Aehlig [Wed, 8 Jul 2015 16:09:30 +0000 (18:09 +0200)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Tell git to ignore tools/ssl-update
  Use 'exclude_daemons' option for master only
  Disable superfluous restarting of daemons
  Add tests exercising the "crashed" state handling
  Add proper handling of the "crashed" Xen state

* stable-2.11
  Fix capitalization of TestCase
  Trigger renew-crypto on downgrade to 2.11

Conflicts:
.gitignore: use all additions

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Properly get rid of all watcher jobs
Klaus Aehlig [Wed, 8 Jul 2015 15:28:44 +0000 (17:28 +0200)]
Properly get rid of all watcher jobs

Our tests running via RunWithLocks strictly depend on no
watcher jobs interfering. Therefore they pause the watcher;
unfortunately, there still is a race: the watcher only checks
the pause status upon its invocation, but submits jobs later
in its run time. Therefore not only pause it (doesn't hurt),
but also add a filter to reject all its jobs, and then wait
for all running jobs to terminate.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Move stdout_of to qa_utils
Klaus Aehlig [Wed, 8 Jul 2015 15:28:43 +0000 (17:28 +0200)]
Move stdout_of to qa_utils

...so that it can be used outside the filter test as well.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Merge branch 'stable-2.11' into stable-2.12
Klaus Aehlig [Wed, 8 Jul 2015 15:36:24 +0000 (17:36 +0200)]
Merge branch 'stable-2.11' into stable-2.12

* stable-2.11
  Fix capitalization of TestCase
  Trigger renew-crypto on downgrade to 2.11

Conflicts:
tools/post-upgrade: use 2.12 condition on when to run the hook

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Tell git to ignore tools/ssl-update
Klaus Aehlig [Wed, 8 Jul 2015 14:38:32 +0000 (16:38 +0200)]
Tell git to ignore tools/ssl-update

This tool was recently added, but not added to .gitignore. Do
so now.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years ago  Use 'exclude_daemons' option for master only
Helga Velroyen [Thu, 2 Jul 2015 13:07:12 +0000 (15:07 +0200)]
Use 'exclude_daemons' option for master only

During 'gnt-cluster renew-crypto --new-cluster-certificate'
or '... --new-node-certificates' all daemons are shut down,
except for wconfd and noded. So far, noded was left running
on all nodes, although this is only necessary on the master.
This patch makes sure that the 'exclude_daemons' flag only
applies to the master, as all interesting operations will
only need these daemons there.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Disable superfluous restarting of daemons
Helga Velroyen [Thu, 2 Jul 2015 12:42:20 +0000 (14:42 +0200)]
Disable superfluous restarting of daemons

This patch fixes a little glitch where the Ganeti
daemons were stopped and started unnecessarily if
only the cluster certificate was renewed but nothing
else.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Add tests exercising the "crashed" state handling
Hrvoje Ribicic [Mon, 6 Jul 2015 17:23:31 +0000 (17:23 +0000)]
Add tests exercising the "crashed" state handling

This patch adds a few tests that make sure the state is handled
properly, using examples taken from a running cluster.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years ago  Add proper handling of the "crashed" Xen state
Hrvoje Ribicic [Mon, 6 Jul 2015 17:17:41 +0000 (17:17 +0000)]
Add proper handling of the "crashed" Xen state

Whenever an instance would enter the crashed state due to kernel issues
or other horrible problems, Ganeti would not be able to interpret the
data and would report strange and incomprehensible errors. This patch
fixes this by adding proper handling for the "crashed" state.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years ago  Merge branch 'stable-2.12' into stable-2.13
Helga Velroyen [Tue, 7 Jul 2015 09:52:47 +0000 (11:52 +0200)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Handle SSL setup when downgrading
  Write SSH ports to ssconf files
  Noded: Consider certificate chain in callback
  Cluster-keys-replacement: update documentation
  Backend: Use timestamp as serial no for server cert
  UPGRADE: add note about 2.12.5
  NEWS: Mention issue 1094
  man: mention changes in renew-crypto
  Verify: warn about self-signed client certs
  Bootstrap: validate SSL setup before starting noded
  Clean up configuration of curl request
  Renew-crypto: remove superflous copying of node certs
  Renew-crypto: propagate verbose and debug option
  Noded: log the certificate and digest on noded startup
  QA: reload rapi cert after renew crypto
  Prepare-node-join: use common functions
  Renew-crypto: remove dead code
  Init: add master client certificate to configuration
  Renew-crypto: rebuild digest map of all nodes
  Noded: make "bootstrap" a constant
  node-daemon-setup: generate client certificate
  tools: Move (Re)GenerateClientCert to common
  Renew cluster and client certificates together
  Init: create the master's client cert in bootstrap
  Renew client certs using ssl_update tool
  Run functions while (some) daemons are stopped
  Back up old client.pem files
  Introduce ssl_update tool
  x509 function for creating signed certs
  Add tools/common.py from 2.13
  Consider ECDSA in SSH setup
  Update documentation of watcher and RAPI daemon
  Watcher: add option for setting RAPI IP
  When connecting to Metad fails, log the full stack trace
  Set up the Metad client with allow_non_master
  Set up the configuration client properly on non-masters
  Add the 'allow_non_master' option to the WConfd RPC client
  Add the option to disable master checks to the RPC client
  Add 'allow_non_master' to the Luxi test transport class too
  Add 'allow_non_master' to FdTransport for compatibility
  Properly document all constructor arguments of Transport
  Allow the Transport class to be used for non-master nodes
  Don't define the set of all daemons twice

Conflicts:
  Makefile.am
  NEWS
  UPGRADE
  lib/client/gnt_cluster.py
  lib/cmdlib/cluster.py
  lib/tools/common.py
  lib/tools/prepare_node_join.py
  lib/watcher/__init__.py
  man/ganeti-watcher.rst
  src/Ganeti/OpCodes.hs
  test/hs/Test/Ganeti/OpCodes.hs
  test/py/cmdlib/cluster_unittest.py
  test/py/ganeti.tools.prepare_node_join_unittest.py
  tools/cfgupgrade

Resolutions:
  Makefile.am:
    add ssl_update and ssh_update
  NEWS:
    add new sections from 2.12 and 2.13
  UPGRADE:
    add notes for both 2.12 and 2.13
  lib/client/gnt_cluster.py:
    add all new options to RenewCluster, remove version-specific
    downgrade code
  lib/tools/common.py:
    split the two mismatching versions of _VerifyCertificate
    and VerifyCertificate up into [_]VerifyCertificate{Soft,Strong}
    and update usages accordingly
  lib/tools/prepare_node_join.py:
    update usage of correct VerifyCertificate function
  lib/watcher/__init__.py:
    add both new options, --rapi-ip and --no-verify-disks
  man/ganeti-watcher.rst:
    update docs for both new options (see above)
  src/Ganeti/OpCodes.hs:
    add all new options to OpRenewCrypto
  test/hs/Test/Ganeti/OpCodes.hs:
    add enough 'arbitrary' for all new options of OpRenewCrypto
  test/py/cmdlib/cluster_unittest.py:
    use changes from 2.12
  test/py/ganeti.tools.prepare_node_join_unittest.py:
    remove tests that were moved to common_unittest.py
  tools/cfgupgrade:
    use only downgrade code of 2.13

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Describe --no-verify-disks option in watcher man page
Klaus Aehlig [Mon, 6 Jul 2015 10:54:46 +0000 (12:54 +0200)]
Describe --no-verify-disks option in watcher man page

While there, also mention that it does more than checking
for rebooted nodes.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Make disk verification optional
Klaus Aehlig [Mon, 6 Jul 2015 10:35:17 +0000 (12:35 +0200)]
Make disk verification optional

In some setups, verification of disks can take a long
time, whereas it is still desirable to run the other
watcher operations more regularly. Hence support this
use case by providing an option to not run disk verification,
allowing for more elaborate cron schedules. Fixes issue 1090.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Handle SSL setup when downgrading
Helga Velroyen [Wed, 1 Jul 2015 08:45:02 +0000 (10:45 +0200)]
Handle SSL setup when downgrading

This patch handles the downgrade of the SSL setup
from 2.12 to 2.11. Essentially, all client.pem and
ssconf_master_candidates_certs files will be deleted.
This puts the cluster into a pre-2.11 mode with respect
to SSL and results in a nagging message in the output of
'gnt-cluster verify' to re-run 'gnt-cluster
renew-crypto'.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Write SSH ports to ssconf files
Helga Velroyen [Tue, 30 Jun 2015 08:48:11 +0000 (10:48 +0200)]
Write SSH ports to ssconf files

For downgrading the SSL setup from 2.12 to 2.11, we
need to be able to SSH into machines while no daemons are
running. Unfortunately, the only way to obtain
custom-configured SSH ports is currently by queries. In order
to access this information while the daemons are shut down,
this patch adds the SSH port information to an ssconf file.

This will also be used to simplify some backend calls for
the *SSH* handling in 2.13.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Noded: Consider certificate chain in callback
Helga Velroyen [Wed, 24 Jun 2015 12:19:17 +0000 (14:19 +0200)]
Noded: Consider certificate chain in callback

This patch significantly changes the callback that is
called upon receiving an incoming SSL connection. This
callback is called not only with the certificate
that the client sends, but also (in some implementations)
with the entire certificate chain of the client
certificate.

In our case, the certificate chain contains
the client certificate and the server certificate as
the one that signed the client certificate. This means
that we have to accept the server certificate, but only
if we receive it with a 'depth' greater than 0, meaning
that it is part of the chain and not the actual
certificate. If the depth value is 0, we can be sure
to have received the actual certificate and match it
against the list of master candidate certificates as
before.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
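
The depth rule described above can be sketched as a small pure
function. This is only an illustration under stated assumptions: the
name accept_certificate and its parameters are hypothetical, digests
are plain strings, and the real callback in noded receives certificate
objects from the SSL library rather than digests.

```python
def accept_certificate(depth, digest, server_digest, candidate_digests):
    """Decide whether to accept one certificate presented to the callback.

    depth > 0: the certificate is part of the chain, so only the server
               (cluster) certificate that signed the client cert is accepted.
    depth == 0: this is the actual client certificate, so it must match
                one of the master candidate certificate digests.
    """
    if depth > 0:
        return digest == server_digest
    return digest in candidate_digests
```

Note that the server certificate is rejected at depth 0 in this sketch,
so it cannot be used as a client certificate on its own.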

5 years ago  Cluster-keys-replacement: update documentation
Helga Velroyen [Wed, 24 Jun 2015 12:03:03 +0000 (14:03 +0200)]
Cluster-keys-replacement: update documentation

This patch updates the cluster-keys-replacement document,
which guides users through replacing the crypto keys
of their cluster. It now reflects the changes with respect
to server/client certificates.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Backend: Use timestamp as serial no for server cert
Helga Velroyen [Wed, 24 Jun 2015 11:27:30 +0000 (13:27 +0200)]
Backend: Use timestamp as serial no for server cert

So far, all of Ganeti's server certificates had the serial
number '1'. While this works, it makes it hard to
distinguish situations where the certificate is
renewed from those where it wasn't. This patch uses
a timestamp as serial number.

While this is still not strictly according to the SSL RFC,
it is at least a number that is strictly growing and we
can be sure that no two different server certificates
will have the same serial number.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
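
A minimal sketch of the idea, assuming the serial number is the Unix
time in whole seconds. The function name new_serial_number is
hypothetical; the exact resolution used by the backend is not stated
in this message.

```python
import time


def new_serial_number(now=None):
    """Return a certificate serial number derived from a timestamp."""
    if now is None:
        now = time.time()
    # Whole seconds since the epoch: grows with every renewal, provided
    # two renewals do not happen within the same second.
    return int(now)
```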

5 years ago  UPGRADE: add note about 2.12.5
Helga Velroyen [Wed, 24 Jun 2015 11:22:02 +0000 (13:22 +0200)]
UPGRADE: add note about 2.12.5

This patch adds comments to the upgrade documentation
to advise users to rerun renew-crypto if they update
to 2.12.5.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  NEWS: Mention issue 1094
Helga Velroyen [Wed, 24 Jun 2015 11:19:46 +0000 (13:19 +0200)]
NEWS: Mention issue 1094

This patch updates the NEWS file to advise users to rerun
renew-crypto after an update to 2.12.5.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  man: mention changes in renew-crypto
Helga Velroyen [Wed, 24 Jun 2015 11:17:19 +0000 (13:17 +0200)]
man: mention changes in renew-crypto

This updates the gnt-cluster man page with respect to the
changes to server and client certificates and how they affect
the operation 'gnt-cluster renew-crypto'.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Verify: warn about self-signed client certs
Helga Velroyen [Wed, 24 Jun 2015 09:56:23 +0000 (11:56 +0200)]
Verify: warn about self-signed client certs

Since, from this patch series on, there should be no
self-signed certificates in a cluster anymore, add
a warning to cluster-verify to nag people to renew
their certificates.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Bootstrap: validate SSL setup before starting noded
Helga Velroyen [Mon, 22 Jun 2015 13:01:04 +0000 (15:01 +0200)]
Bootstrap: validate SSL setup before starting noded

This patch adds a few checks which ensure that all
files necessary for proper SSL communication are
in place before noded is started on the master node.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Clean up configuration of curl request
Helga Velroyen [Mon, 22 Jun 2015 12:43:12 +0000 (14:43 +0200)]
Clean up configuration of curl request

This is a small patch cleaning up some things in the
composition of the pycurl object for RPC calls.
For example, it removes some superfluous 'str' and
increases the logging level to warning when the
server cert is used.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Renew-crypto: remove superflous copying of node certs
Helga Velroyen [Mon, 22 Jun 2015 08:59:09 +0000 (10:59 +0200)]
Renew-crypto: remove superflous copying of node certs

Since now the server certificates are copied in their
own dedicated function, remove adding their file name
in the general function for renewing crypto tokens.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Renew-crypto: propagate verbose and debug option
Helga Velroyen [Fri, 19 Jun 2015 11:36:06 +0000 (13:36 +0200)]
Renew-crypto: propagate verbose and debug option

This patch enables the user to add --debug and/or --verbose
to the call of 'renew-crypto'. This way, more output is
shown, which makes it easier to debug SSL problems.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Noded: log the certificate and digest on noded startup
Helga Velroyen [Fri, 19 Jun 2015 09:52:36 +0000 (11:52 +0200)]
Noded: log the certificate and digest on noded startup

This patch adds logging of the filename and the digest of the
certificate which is loaded by noded on startup. This will
help debugging SSL problems as it will make clear whether or
not the noded is still using a stale/replaced/old server
certificate after a renewal.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  QA: reload rapi cert after renew crypto
Helga Velroyen [Thu, 18 Jun 2015 13:52:30 +0000 (15:52 +0200)]
QA: reload rapi cert after renew crypto

When running the QA, we copy the rapi certificate to the
machine which steers the QA to use it later in the QA
for testing RAPI calls. However, before we get to that
part of the QA, the rapi certificate is replaced when
'gnt-cluster renew-crypto' is called.

This patch makes sure that the new rapi certificate is
copied to the steering machine so that later RAPI calls
do not fail. It remains mysterious how this worked before.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Prepare-node-join: use common functions
Helga Velroyen [Tue, 16 Jun 2015 15:46:04 +0000 (17:46 +0200)]
Prepare-node-join: use common functions

This patch makes prepare_node_join use some of the functions
that were moved to tools/common.py. The respective unittests
are removed, because they are already tested in
common_unittest.py.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Renew-crypto: remove dead code
Helga Velroyen [Tue, 16 Jun 2015 15:35:46 +0000 (17:35 +0200)]
Renew-crypto: remove dead code

This patch removes the code for renewing the master
node's client certificate using SSL. This is no longer
needed, as the master node's certificate is created
in gnt_cluster.py already.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Init: add master client certificate to configuration
Helga Velroyen [Tue, 16 Jun 2015 14:17:27 +0000 (16:17 +0200)]
Init: add master client certificate to configuration

This patch adds a few steps to bootstrap.py. After the
creation of the server (cluster) certificate and the
master node's client certificate, the digest of that
client certificate is added to the configuration and,
through an update of the configuration, written to the
ssconf_master_candidates_certs file.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Renew-crypto: rebuild digest map of all nodes
Helga Velroyen [Tue, 16 Jun 2015 12:40:12 +0000 (14:40 +0200)]
Renew-crypto: rebuild digest map of all nodes

During a renew-crypto operation, all nodes will create
new client certificates. Afterwards, the fingerprints
(digests) of the master candidate nodes need to be
collected and added to the configuration. This is done
by an RPC call, which will succeed as the master
node's certificate digest was propagated to the nodes
before.

This also removes two unit tests which are no longer
necessary, because there will be no RPC call from
the master to itself anymore.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Noded: make "bootstrap" a constant
Helga Velroyen [Tue, 16 Jun 2015 12:24:11 +0000 (14:24 +0200)]
Noded: make "bootstrap" a constant

Noded uses the constant "bootstrap" when starting
without client certificates. This patch moves the
constant to Constants.hs.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  node-daemon-setup: generate client certificate
Helga Velroyen [Mon, 15 Jun 2015 14:43:24 +0000 (16:43 +0200)]
node-daemon-setup: generate client certificate

So far, the client certificate of a node that is added
to the cluster was created in LUNodeAdd using an RPC
call. This is now simplified by creating the certificate
already in tools/node_daemon_setup.py and only retrieving
its fingerprint by RPC to add it to the configuration.

This simplifies the backend function, which now only reads
the fingerprint instead of creating the certificate.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  tools: Move (Re)GenerateClientCert to common
Helga Velroyen [Mon, 15 Jun 2015 14:36:24 +0000 (16:36 +0200)]
tools: Move (Re)GenerateClientCert to common

So far, the generation of client certificates was only
called from ssl_update.py, which is used when calling
'gnt-cluster renew-crypto'. This patch moves the function
from ssl_update.py to tools/common.py, because it will also
be needed by prepare_node_join.py when adding nodes
(see next patch in the series).

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Renew cluster and client certificates together
Helga Velroyen [Wed, 10 Jun 2015 10:56:15 +0000 (12:56 +0200)]
Renew cluster and client certificates together

So far, the cluster certificate and the individual node
certificates could be renewed independently of each other.
This is no longer possible, because when renewing the
server certificate, all node certificates need to be
renewed as well, because they are signed by the server
certificate. This patch couples the two operations
together.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Init: create the master's client cert in bootstrap
Helga Velroyen [Tue, 9 Jun 2015 15:56:09 +0000 (17:56 +0200)]
Init: create the master's client cert in bootstrap

This patch extends bootstrap.py to not only create
the cluster certificate but also the master node's
client certificate.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Renew client certs using ssl_update tool
Helga Velroyen [Tue, 9 Jun 2015 12:19:15 +0000 (14:19 +0200)]
Renew client certs using ssl_update tool

This patch integrates renewing the client certificate
of non-master nodes using the new ssl_update tool.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Run functions while (some) daemons are stopped
Helga Velroyen [Tue, 9 Jun 2015 09:10:04 +0000 (11:10 +0200)]
Run functions while (some) daemons are stopped

For the new renew-crypto operation, we need to run
functions while most of the daemons are stopped,
except for WConfd. This refactors our code a bit
and generalizes the method that runs functions
while *all* daemons are stopped to one that
accepts a list of daemons to not be stopped.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
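
The generalization can be sketched as follows. This is an illustrative
sketch, not the code from the patch: the function name
run_while_daemons_stopped and the stop/start callables are
hypothetical stand-ins for whatever mechanism actually controls the
daemons.

```python
def run_while_daemons_stopped(fn, all_daemons, stop, start,
                              exclude_daemons=()):
    """Run fn() while all daemons except exclude_daemons are stopped.

    stop and start are callables taking a daemon name (assumed helpers).
    """
    to_stop = [d for d in all_daemons if d not in exclude_daemons]
    for daemon in to_stop:
        stop(daemon)
    try:
        return fn()
    finally:
        # Restart in reverse order, even if fn() raised an exception.
        for daemon in reversed(to_stop):
            start(daemon)
```

Passing an empty exclude_daemons gives the previous behaviour of
stopping *all* daemons.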

5 years ago  Back up old client.pem files
Helga Velroyen [Mon, 8 Jun 2015 09:43:00 +0000 (11:43 +0200)]
Back up old client.pem files

For post-mortems, let's make a backup of the client
certificates before renewing them.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
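
A minimal sketch of such a backup step, assuming a timestamped copy
next to the original file. The function name backup_file and the
".backup-<timestamp>" naming scheme are hypothetical and not taken from
the patch.

```python
import os
import shutil
import time


def backup_file(path, now=None):
    """Copy path to a timestamped backup file and return the backup's name.

    Returns None if there is nothing to back up.
    """
    if not os.path.exists(path):
        return None
    # time.gmtime(None) uses the current time.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.gmtime(now))
    backup = "%s.backup-%s" % (path, stamp)
    shutil.copy2(path, backup)  # keeps the original in place
    return backup
```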

5 years ago  Introduce ssl_update tool
Helga Velroyen [Fri, 5 Jun 2015 13:45:00 +0000 (15:45 +0200)]
Introduce ssl_update tool

In order to renew client certificates via SSH (rather than
on the fly via SSL as it was before), we need a new tool
which can be called on remote nodes via SSH.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  x509 function for creating signed certs
Helga Velroyen [Fri, 5 Jun 2015 13:35:00 +0000 (15:35 +0200)]
x509 function for creating signed certs

So far, all our SSL certificates were self-signed. Since,
from this patch series on, client certificates will be signed
by the cluster certificate, we need a utility function for
creating certificates that are not self-signed.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Add tools/common.py from 2.13
Helga Velroyen [Wed, 3 Jun 2015 11:53:15 +0000 (13:53 +0200)]
Add tools/common.py from 2.13

We will need some functions from tools/common.py, which
are only present from 2.13 on. Unfortunately, there are
no clear commits for that, so cherry-picking is not
an option. This patch simply copies the file, and one
has to be careful with the next merge.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years ago  Consider ECDSA in SSH setup
Helga Velroyen [Wed, 1 Jul 2015 12:24:11 +0000 (14:24 +0200)]
Consider ECDSA in SSH setup

So far, Ganeti only cared about DSA and RSA host
keys. With the rising popularity of ECDSA, we should
support this key type as well, as it is already
enabled by default in many common distributions.

This fixes Issue 1098.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
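
As an illustration of what supporting an additional key type involves,
the following sketch enumerates host key files per key type. The names
SSH_HOST_KEY_TYPES and host_key_files are hypothetical; the file names
follow the usual OpenSSH convention (ssh_host_<type>_key), and the
/etc/ssh default is an assumption.

```python
import os

# Key types to consider; "ecdsa" is the newly supported one.
SSH_HOST_KEY_TYPES = ("dsa", "rsa", "ecdsa")


def host_key_files(key_type, ssh_dir="/etc/ssh"):
    """Return (private, public) host key file names for a key type."""
    if key_type not in SSH_HOST_KEY_TYPES:
        raise ValueError("unsupported key type: %s" % key_type)
    private = os.path.join(ssh_dir, "ssh_host_%s_key" % key_type)
    return private, private + ".pub"
```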

5 years ago  Update documentation of watcher and RAPI daemon
Helga Velroyen [Thu, 2 Jul 2015 12:10:24 +0000 (14:10 +0200)]
Update documentation of watcher and RAPI daemon

.. to reflect the relationship between the RAPI daemon's
-b option and the watcher's --rapi-ip option.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Watcher: add option for setting RAPI IP
Helga Velroyen [Thu, 2 Jul 2015 11:38:17 +0000 (13:38 +0200)]
Watcher: add option for setting RAPI IP

By default, the RAPI daemon binds to 0.0.0.0 when being
started. This means it serves from any IP the machine is
configured for. This works well together with the watcher,
which always polls the RAPI daemon on 127.0.0.1 and
restarts it when it is not reachable.

If a user decides to start the RAPI daemon with a particular
IP other than 127.0.0.1 (using the option -b, for example
set in /etc/default/ganeti), RAPI will only serve from that
IP and thus it will not be reachable from 127.0.0.1. Since
the watcher only polls on this IP, it will inevitably fail
to connect to the RAPI daemon and thus restart it every five
minutes.

To solve this, this patch adds an option --rapi-ip to the
watcher. Whenever -b of the RAPI daemon is set, the watcher
needs to be fed the same IP with --rapi-ip (which means
editing /etc/cron.d/ganeti, for example). This is not optimal
regarding user experience (as it is easy to forget one of
the two places). The alternative would be to make this
a Ganeti configuration parameter which is fed to both the RAPI
daemon and the watcher, but this would be significantly more
effort for this relatively rarely used feature.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
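
The option's behaviour can be sketched with argparse. This is only an
illustration: the function name parse_watcher_args is hypothetical,
and the watcher's actual option parsing code is not shown in this
message. Only the option name --rapi-ip and the 127.0.0.1 default are
taken from the text above.

```python
import argparse


def parse_watcher_args(argv):
    """Parse the (sketched) watcher command line."""
    parser = argparse.ArgumentParser(prog="watcher-sketch")
    # Default matches the address the watcher polled before this patch.
    parser.add_argument("--rapi-ip", dest="rapi_ip", default="127.0.0.1",
                        help="IP address on which to poll the RAPI daemon")
    return parser.parse_args(argv)
```

If the RAPI daemon is started with -b set to some address, the same
address would be passed here, for example
parse_watcher_args(["--rapi-ip", "192.0.2.10"]).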

5 years ago  Fix capitalization of TestCase
Helga Velroyen [Fri, 3 Jul 2015 09:04:21 +0000 (11:04 +0200)]
Fix capitalization of TestCase

.. and with this unbreak the tests.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years ago  Trigger renew-crypto on downgrade to 2.11
Helga Velroyen [Wed, 1 Jul 2015 08:16:34 +0000 (10:16 +0200)]
Trigger renew-crypto on downgrade to 2.11

With the upcoming changes in 2.12, it is necessary to run
'gnt-cluster renew-crypto --new-node-certificates'. To
ensure that our QA runs smoothly, this means that this
command needs to be added to the post-upgrade hooks of
2.11. To ensure that it is only run when coming from
2.12.X or from before 2.11, the utility functions are
extended by an equal operator for versions.

Note that it is unlikely that 2.11 will get another release,
so this is mainly to fix our QA. However, users downgrading
to a previous version of 2.11 will get a nagging message
to re-run renew-crypto manually.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoWhen connecting to Metad fails, log the full stack trace
Petr Pudlak [Thu, 2 Jul 2015 13:27:13 +0000 (15:27 +0200)]
When connecting to Metad fails, log the full stack trace

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoSet up the Metad client with allow_non_master
Petr Pudlak [Thu, 2 Jul 2015 13:11:00 +0000 (15:11 +0200)]
Set up the Metad client with allow_non_master

.. since the communication takes place on non-master nodes.

This ensures the client properly retries if there is a communication
failure.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoSet up the configuration client properly on non-masters
Petr Pudlak [Thu, 2 Jul 2015 09:29:28 +0000 (11:29 +0200)]
Set up the configuration client properly on non-masters

If the configuration client is opened in the 'accept_foreign' mode,
meaning it is running on a non-master node temporarily, the option
needs to be propagated to the RPC client as well.

This fixes issue #1115.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoAdd the 'allow_non_master' option to the WConfd RPC client
Petr Pudlak [Thu, 2 Jul 2015 09:28:41 +0000 (11:28 +0200)]
Add the 'allow_non_master' option to the WConfd RPC client

While at it, fix the call to the AbstractStubClient to properly pass the
keyword arguments.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoAdd the option to disable master checks to the RPC client
Petr Pudlak [Thu, 2 Jul 2015 09:26:18 +0000 (11:26 +0200)]
Add the option to disable master checks to the RPC client

The option is propagated to the Transport class and allows disabling
checks for the master node if the client is run on a different node on
purpose.

While at it, fix the documentation for the arguments of the constructors
of the classes.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoAdd 'allow_non_master' to the Luxi test transport class too
Petr Pudlak [Thu, 2 Jul 2015 11:51:34 +0000 (13:51 +0200)]
Add 'allow_non_master' to the Luxi test transport class too

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoAdd 'allow_non_master' to FdTransport for compatibility
Petr Pudlak [Thu, 2 Jul 2015 09:27:13 +0000 (11:27 +0200)]
Add 'allow_non_master' to FdTransport for compatibility

Since it serves as an alternative to the Transport class, it should
support the same constructor options, even if it doesn't use them.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoProperly document all constructor arguments of Transport
Petr Pudlak [Thu, 2 Jul 2015 09:58:02 +0000 (11:58 +0200)]
Properly document all constructor arguments of Transport

.. while documenting allow_non_master.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoAllow the Transport class to be used for non-master nodes
Petr Pudlak [Fri, 5 Jun 2015 12:13:48 +0000 (14:13 +0200)]
Allow the Transport class to be used for non-master nodes

If a communication failure occurred and the caller was not running on
the master node, Transport assumed that this itself was the cause of
the error condition.

However, for communication with the metadata daemon we need to support
non-master nodes as well.

Add a parameter that allows using the class on non-master nodes.
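
A minimal sketch of the behaviour described here, assuming a
simplified class: TransportSketch, the two exception classes and the
is_master callable are hypothetical stand-ins, and only the parameter
name allow_non_master is taken from the related commits.

```python
class NotMasterError(Exception):
    """Failure attributed to running on a non-master node."""


class ConnectError(Exception):
    """Plain communication failure."""


class TransportSketch:
    def __init__(self, address, is_master, allow_non_master=False):
        self.address = address
        self._is_master = is_master  # callable returning True on the master
        self.allow_non_master = allow_non_master

    def classify_failure(self, cause):
        """Pick the exception a caller should see for a failed connection."""
        if not self.allow_non_master and not self._is_master():
            # Old behaviour: assume the node role itself is the problem.
            return NotMasterError("not running on the master node")
        # With allow_non_master the failure is reported as it is, so the
        # caller can retry.
        return ConnectError("cannot reach %s: %s" % (self.address, cause))


strict = TransportSketch("/run/example.sock", lambda: False)
relaxed = TransportSketch("/run/example.sock", lambda: False,
                          allow_non_master=True)
print(type(strict.classify_failure("refused")).__name__)
print(type(relaxed.classify_failure("refused")).__name__)
```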

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

Cherry-picked-from: ade70feb258a57ae0565395ba48ac2b3ef02b1c0
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoDon't define the set of all daemons twice
Klaus Aehlig [Tue, 30 Jun 2015 15:20:33 +0000 (17:20 +0200)]
Don't define the set of all daemons twice

Currently, we have two places where we define the
list of all Ganeti daemons: the type GanetiDaemon
in Ganeti.Runtime and the constant daemons
in Ganeti.Constants. Avoid this duplication by
using Bounded GanetiDaemon and Enum GanetiDaemon.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoFull QuickCheck 2.7 compatibility
Niklas Hambuechen [Fri, 7 Nov 2014 23:51:34 +0000 (00:51 +0100)]
Full QuickCheck 2.7 compatibility

This renames the deprecated `printTestCase` to its replacement
`counterexample`, and provides a CPP-guarded fallback for QuickCheck < 2.7.

Signed-off-by: Niklas Hambuechen <niklash@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Conflicts:
test/hs/Test/Ganeti/TestCommon.hs
test/hs/Test/Ganeti/Utils/Statistics.hs
Resolution:
test/hs/Test/Ganeti/TestCommon.hs: keep changes introduced
           in 2.12 after the previous cherry-picking of this patch
test/hs/Test/Ganeti/Utils/Statistics.hs: keep the current limit
           1e-10 in prop_stddev_update

Cherry-picked-from: 077c415a09f8c381ce788ebe6c065d8ccab60564
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoQuickCheck 2.7 compatibility
Niklas Hambuechen [Fri, 7 Nov 2014 22:48:46 +0000 (23:48 +0100)]
QuickCheck 2.7 compatibility

This makes our tests compile without errors with QuickCheck 2.7.
Warnings about the deprecation of printTestCase remain when using 2.7.

This change is backwards-compatible with all older versions of QuickCheck
that we support.

In 2.7, Property is no longer a monad, but remains a `Gen Prop` inside,
so that we only have to use combinations of `property` and `return`
to become compatible.

See
  https://hackage.haskell.org/package/QuickCheck-2.7.6/changelog

Further, in QuickCheck 2.7, Positive/NonZero/NonNegative are no longer
instances of `Integral` (NonNegative could likely still be one, see
https://github.com/nick8325/quickcheck/issues/31).
Consequently we cannot create them using `fromIntegral` any more,
and switch to `fromEnum` instead, which also is backwards-compatible.

Signed-off-by: Niklas Hambuechen <niklash@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Cherry-picked-from: 4320ba1dcfe49b659abbc46a6cf37e6a4db66f22
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoMerge branch 'stable-2.12' into stable-2.13
Petr Pudlak [Mon, 29 Jun 2015 14:25:27 +0000 (16:25 +0200)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Update design doc with solution for Issue 1094
  Prevent multiple communication nics for one instance
  Remove outdated reference to ganeti-masterd
  Update ganeti-luxid man page
  Add a man page for ganeti-wconfd
  Make htools tolerate missing "dtotal" and "dfree" on luxi
  Get QuickCheck 2.7 compatibility
  TestCommon: Fix QuickCheck import warnings
  Full QuickCheck 2.7 compatibility
  Add a CPP macro for checking the version of QuickCheck
  QuickCheck 2.7 compatibility

* stable-2.11
  Downgrade log-message for rereading job
  Dowgrade log-level for successful requests

Conflicts:
test/hs/Test/Ganeti/TestCommon.hs
Resolution:
test/hs/Test/Ganeti/TestCommon.hs: keep additions from both
          versions

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoMerge branch 'stable-2.11' into stable-2.12
Petr Pudlak [Mon, 29 Jun 2015 13:33:39 +0000 (15:33 +0200)]
Merge branch 'stable-2.11' into stable-2.12

* stable-2.11
  Downgrade log-message for rereading job
  Dowgrade log-level for successful requests

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoDowngrade log-message for rereading job
Klaus Aehlig [Mon, 29 Jun 2015 09:34:13 +0000 (11:34 +0200)]
Downgrade log-message for rereading job

The fact that luxid is rereading a job file because it has
changed on disk is mainly of internal interest for debugging.
Hence downgrade the log-level accordingly.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoDowgrade log-level for successful requests
Klaus Aehlig [Mon, 29 Jun 2015 09:30:13 +0000 (11:30 +0200)]
Dowgrade log-level for successful requests

Originally, only queries used to be served by Haskell daemons
over domain sockets. As they were not too frequent, it was OK
to log each of them at INFO level. However, with requests as
frequent as WaitForJobChange served via luxid, logs fill up
too quickly. So log at debug level only.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoUpdate design doc with solution for Issue 1094
Helga Velroyen [Tue, 2 Jun 2015 11:52:25 +0000 (13:52 +0200)]
Update design doc with solution for Issue 1094

Fixing issue 1094 unfortunately will result in a bigger
change. This change is big enough to be documented in
the node-security design doc.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoPrevent multiple communication nics for one instance
Lisa Velden [Tue, 23 Jun 2015 15:16:42 +0000 (17:16 +0200)]
Prevent multiple communication nics for one instance

Check if a nic name is already in the list of all nics before adding it.
Expand the instance name before that check to ensure that we are always
checking for the correct name.
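
The check can be sketched as below. The helper names and the
"comm-nic:" naming scheme are invented for this illustration and are
not Ganeti's actual identifiers; the point is only the order of the
steps (expand the name first, then test for membership, then add).

```python
def expand_instance_name(name, known_instances):
    """Resolve a possibly shortened name to the unique full instance name."""
    matches = [full for full in known_instances
               if full == name or full.startswith(name + ".")]
    if len(matches) != 1:
        raise ValueError("cannot expand instance name %r" % name)
    return matches[0]


def add_communication_nic(nic_names, instance_name, known_instances):
    """Add a communication NIC name unless it is already present.

    Returns True if a NIC was added, False if it existed already.
    """
    full_name = expand_instance_name(instance_name, known_instances)
    nic_name = "comm-nic:" + full_name  # hypothetical naming scheme
    if nic_name in nic_names:
        return False
    nic_names.append(nic_name)
    return True


known = ["web1.example.com", "db1.example.com"]
nics = []
print(add_communication_nic(nics, "web1", known))
print(add_communication_nic(nics, "web1.example.com", known))
print(nics)
```

Without the expansion step, "web1" and "web1.example.com" would yield
two different NIC names for the same instance.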

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

5 years agoRemove outdated reference to ganeti-masterd
Klaus Aehlig [Mon, 22 Jun 2015 16:24:38 +0000 (16:24 +0000)]
Remove outdated reference to ganeti-masterd

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoUpdate ganeti-luxid man page
Klaus Aehlig [Mon, 22 Jun 2015 16:21:52 +0000 (18:21 +0200)]
Update ganeti-luxid man page

The luxid has taken over more tasks than just queries. Also, remove
outdated references to masterd and split-queries.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoAdd a man page for ganeti-wconfd
Klaus Aehlig [Mon, 22 Jun 2015 15:50:43 +0000 (17:50 +0200)]
Add a man page for ganeti-wconfd

This daemon was added with the jobs-as-processes refactoring,
but a man page has not been added so far. Do this now.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoMake htools tolerate missing "dtotal" and "dfree" on luxi
Klaus Aehlig [Tue, 16 Jun 2015 09:15:48 +0000 (11:15 +0200)]
Make htools tolerate missing "dtotal" and "dfree" on luxi

If a cluster allows sharedfile as its only disk template, the amount of
total and free disk space might not be available. This is perfectly
normal, hence make the luxi backend handle it gracefully and just report
0 available disk on 0 total disk.
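
The fallback can be illustrated with a few lines of Python. The
actual change is in the Haskell luxi backend of htools; the function
below is only a sketch of the same idea, and apart from the field
names "dtotal" and "dfree" everything in it is an assumption.

```python
def disk_stats(node_fields):
    """Return (dtotal, dfree), reporting 0 where a value is missing or None."""
    dtotal = node_fields.get("dtotal")
    dfree = node_fields.get("dfree")
    return (0 if dtotal is None else dtotal,
            0 if dfree is None else dfree)


print(disk_stats({"dtotal": 1024, "dfree": 512}))
print(disk_stats({"dtotal": None}))
```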

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

Cherry-picked-from: 49644203823562de0945de3feca5dfaa0cc2dc9c
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

5 years agoGet QuickCheck 2.7 compatibility
Klaus Aehlig [Mon, 1 Jun 2015 11:00:29 +0000 (13:00 +0200)]
Get QuickCheck 2.7 compatibility

Replace deprecated `printTestCase` by its replacement `counterexample`.
Note that commit 077c415a added a CPP-guarded fallback for QuickCheck < 2.7.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

Cherry-picked-from: 693db8a9e7a3e3b855350b9f558251bce1718d07
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoTestCommon: Fix QuickCheck import warnings
Niklas Hambuechen [Tue, 2 Dec 2014 14:22:03 +0000 (15:22 +0100)]
TestCommon: Fix QuickCheck import warnings

This only appears on systems with QuickCheck >= 2.7.

For TestCommon, it happens because the QC qualified name is only used
in the conditional section.
Fixed by making the import conditional as well.

For Statistics, the `Test.Ganeti.TestCommon` import was not necessary
for QC 2.7 because there `Test.QuickCheck` already provides `counterexample`.
Fixed by giving an import list for `Test.QuickCheck`.

Signed-off-by: Niklas Hambuechen <niklash@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Cherry-picked-from: 53bec60146dd49339e1315bfad7884ae89cd39d9
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoFull QuickCheck 2.7 compatibility
Niklas Hambuechen [Fri, 7 Nov 2014 23:51:34 +0000 (00:51 +0100)]
Full QuickCheck 2.7 compatibility

This renames the deprecated `printTestCase` to its replacement
`counterexample`, and provides a CPP-guarded fallback for QuickCheck < 2.7.

Signed-off-by: Niklas Hambuechen <niklash@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Conflicts:
test/hs/Test/Ganeti/JQScheduler.hs
          - removed file not present in 2.12
test/hs/Test/Ganeti/SlotMap.hs
          - removed file not present in 2.12

Cherry-picked-from: 077c415a09f8c381ce788ebe6c065d8ccab60564
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoAdd a CPP macro for checking the version of QuickCheck
Petr Pudlak [Mon, 22 Jun 2015 12:41:07 +0000 (14:41 +0200)]
Add a CPP macro for checking the version of QuickCheck

.. to TestCommon as a preparation for cherry-picking changes that
need it.

The macro and the version detection will be removed in 2.14 where the
functionality is replaced with cabal.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoQuickCheck 2.7 compatibility
Niklas Hambuechen [Fri, 7 Nov 2014 22:48:46 +0000 (23:48 +0100)]
QuickCheck 2.7 compatibility

This makes our tests compile without errors with QuickCheck 2.7.
Warnings about the deprecation of printTestCase remain when using 2.7.

This change is backwards-compatible with all older versions of QuickCheck
that we support.

In 2.7, Property is no longer a monad, but remains a `Gen Prop` inside,
so that we only have to use combinations of `property` and `return`
to become compatible.

See
  https://hackage.haskell.org/package/QuickCheck-2.7.6/changelog

Further, in QuickCheck 2.7, Positive/NonZero/NonNegative are no longer
instances of `Integral` (NonNegative could likely still be one, see
https://github.com/nick8325/quickcheck/issues/31).
Consequently we cannot create them using `fromIntegral` any more,
and switch to `fromEnum` instead, which also is backwards-compatible.

Signed-off-by: Niklas Hambuechen <niklash@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Conflicts:
test/hs/Test/Ganeti/JQScheduler.hs - removed file not present in
          2.12

Cherry-picked-from: 4320ba1dcfe49b659abbc46a6cf37e6a4db66f22
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoMerge branch 'stable-2.12' into stable-2.13
Hrvoje Ribicic [Fri, 19 Jun 2015 15:06:20 +0000 (15:06 +0000)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Fix name of filter-evaluation function
  Call the filter again with runtime data this time
  Fix user and group ordering in test

Conflicts:
src/Ganeti/Query/Query.hs

Resolution:
Query.hs: Used function that exists in 2.13

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoFix name of filter-evaluation function
Klaus Aehlig [Fri, 19 Jun 2015 11:13:39 +0000 (13:13 +0200)]
Fix name of filter-evaluation function

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoCall the filter again with runtime data this time
BSRK Aditya [Fri, 19 Jun 2015 10:19:52 +0000 (12:19 +0200)]
Call the filter again with runtime data this time

genericQuery filters objects without runtime data first.
We need to filter the objects again, this time with runtime data.
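
One way to picture the two passes is the following sketch. It is not
the genericQuery implementation (which is Haskell); the three-valued
predicate and all names are assumptions chosen to show why a second
pass with runtime data is needed.

```python
def matches(instance, runtime):
    """Return True or False, or None if runtime data is needed to decide."""
    if instance["group"] != "web":
        return False              # decidable from configuration alone
    if runtime is None:
        return None               # cannot decide without runtime data
    return runtime["status"] == "running"


def two_pass_query(instances, predicate, collect_runtime):
    # Pass 1: drop objects that already fail on configuration data.
    candidates = [i for i in instances if predicate(i, None) is not False]
    # Runtime data is collected only for the remaining candidates.
    enriched = [(i, collect_runtime(i)) for i in candidates]
    # Pass 2: apply the same filter again, this time with runtime data.
    return [i for (i, rt) in enriched if predicate(i, rt)]


instances = [
    {"name": "a", "group": "web"},
    {"name": "b", "group": "web"},
    {"name": "c", "group": "db"},
]
runtime = {
    "a": {"status": "running"},
    "b": {"status": "down"},
    "c": {"status": "running"},
}
result = two_pass_query(instances, matches, lambda i: runtime[i["name"]])
print([i["name"] for i in result])
```

Skipping the second pass would return both "a" and "b", because the
first pass cannot evaluate the runtime-dependent part of the filter.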

This fixes issue 1100.

Signed-off-by: BSRK Aditya <bsrk@google.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoBump revision number to 2.13.1 v2.13.1
Hrvoje Ribicic [Tue, 16 Jun 2015 16:41:29 +0000 (18:41 +0200)]
Bump revision number to 2.13.1

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoUpdate NEWS file for the 2.13.1 release
Hrvoje Ribicic [Tue, 16 Jun 2015 16:41:07 +0000 (18:41 +0200)]
Update NEWS file for the 2.13.1 release

Mentioning various improvements.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

5 years agoFix user and group ordering in test
Hrvoje Ribicic [Mon, 15 Jun 2015 16:45:18 +0000 (18:45 +0200)]
Fix user and group ordering in test

One of our Haskell tests asserts that the Python and Haskell user and
group constants match. This patch fixes the order in which the mock
Python code outputs the users and groups to match the order of the
Haskell-side enumeration.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

5 years agoMerge branch 'stable-2.12' into stable-2.13
Hrvoje Ribicic [Mon, 15 Jun 2015 10:58:11 +0000 (12:58 +0200)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Fix tests for setting (shared) file storage directory
  Add missing call for setting shared file storage directory
  Update ganeti-luxid synopsis
  Update ganeti-mond synopsis
  Update ganeti-confd synopsis
  Update copyright statement

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoMention migration change in NEWS
Hrvoje Ribicic [Wed, 10 Jun 2015 11:51:28 +0000 (13:51 +0200)]
Mention migration change in NEWS

This patch adds information about the xl migration change to the NEWS
file.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoMove misplaced NEWS entry
Hrvoje Ribicic [Wed, 10 Jun 2015 11:50:47 +0000 (13:50 +0200)]
Move misplaced NEWS entry

A section meant for 2.13.0 was moved to a 2.12 release, and this patch
fixes that.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoAdd protection against daemons that may already be listening
Hrvoje Ribicic [Mon, 8 Jun 2015 16:35:27 +0000 (16:35 +0000)]
Add protection against daemons that may already be listening

Should the migration port already be taken, Ganeti will try to start a
socat daemon that will immediately die, leaving Ganeti to pipe the
migration data into whatever process happens to be listening. This
patch prevents that from happening by checking if the socat daemon
started by Ganeti is ready to accept the migration data.
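
A possible shape for such a check is sketched below. It is an
assumption about how one could do it, not the code of this patch:
the function polls a process handle (anything with a poll() method,
such as subprocess.Popen) and a caller-supplied readiness probe, and
gives up as soon as the daemon we started has exited.

```python
import time


def wait_for_listener(proc, is_ready, attempts=10, delay=0.1):
    """Return True only if *our* daemon is alive and reports readiness."""
    for _ in range(attempts):
        if proc.poll() is not None:
            # Our daemon exited, for example because the port was taken.
            # Whatever is listening on the port now is not ours.
            return False
        if is_ready():
            return True
        time.sleep(delay)
    return False


class FakeProcess:
    """Stand-in for subprocess.Popen, used only for this demonstration."""

    def __init__(self, exit_code):
        self._exit_code = exit_code

    def poll(self):
        return self._exit_code


print(wait_for_listener(FakeProcess(None), lambda: True, delay=0))
print(wait_for_listener(FakeProcess(1), lambda: True, delay=0))
```

Checking the process state before the readiness probe matters: a
probe that merely tests whether the port accepts connections would
also succeed for a foreign listener.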

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoAttempt to cleanup failed migrations using a pidfile
Hrvoje Ribicic [Mon, 8 Jun 2015 16:44:15 +0000 (16:44 +0000)]
Attempt to cleanup failed migrations using a pidfile

In the case that a listening socat daemon was started but the migration
failed on the sending side, the daemon will stay in place and occupy
the migration port forever. This patch attempts to remedy this by
saving the PID of the daemon, and attempting to kill it when the next
migration is started, provided the command line roughly matches our
migration workflow.
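
The "roughly matches" test could look like the sketch below. The
matching criteria are assumptions made for illustration (the patch
may use different ones); the only external fact relied upon is that
socat listening addresses have the form TCP-LISTEN:<port> or
TCP4-LISTEN:<port>, optionally followed by comma-separated options.

```python
def looks_like_migration_socat(argv, port):
    """Decide whether a command line resembles our listening socat."""
    if not argv:
        return False
    executable = argv[0].rsplit("/", 1)[-1]
    if executable != "socat":
        return False
    for arg in argv[1:]:
        keyword, sep, rest = arg.partition(":")
        if not sep:
            continue
        # socat address keywords are case-insensitive.
        if keyword.upper().endswith("LISTEN") and rest.split(",")[0] == str(port):
            return True
    return False


stale = ["/usr/bin/socat", "STDIO", "TCP-LISTEN:8002,reuseaddr"]
other = ["/usr/sbin/sshd", "-D"]
print(looks_like_migration_socat(stale, 8002))
print(looks_like_migration_socat(other, 8002))
```

Only when this returns True for the PID read from the pidfile would
the process be killed; a recycled PID belonging to an unrelated
program is left alone.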

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoAdd utility that gets the full command line of a process
Hrvoje Ribicic [Sun, 7 Jun 2015 21:12:22 +0000 (23:12 +0200)]
Add utility that gets the full command line of a process

This patch provides a simple function which fetches the command line of
a process given its PID, and some tests for it. It is needed as a
safety measure for the socat-based migration being introduced in our
Xen-handling code.
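
On Linux, such a function can be built on /proc/<pid>/cmdline, where
the arguments are separated by NUL bytes. The sketch below assumes
that approach; the function names and the proc_root parameter are
hypothetical and not taken from the patch.

```python
def parse_cmdline(raw):
    """Split the raw content of a cmdline file into a list of arguments."""
    if not raw:
        return []  # e.g. kernel threads have an empty command line
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # drop the terminating NUL, keep empty arguments
    return [arg.decode("utf-8", "replace") for arg in raw.split(b"\0")]


def get_process_cmdline(pid, proc_root="/proc"):
    """Read and parse the command line of the process with the given PID."""
    with open("%s/%d/cmdline" % (proc_root, pid), "rb") as handle:
        return parse_cmdline(handle.read())


print(parse_cmdline(b"socat\0STDIO\0TCP-LISTEN:8002\0"))
```

Separating the parsing from the file access keeps the parsing logic
testable without a real process.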

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoIntroduce socat as a way of doing xl migrations
Hrvoje Ribicic [Tue, 2 Jun 2015 11:05:37 +0000 (11:05 +0000)]
Introduce socat as a way of doing xl migrations

This patch introduces support for socat as a means of doing xl
migrations. The primary reason for doing so is that Ganeti no longer
handles SSH key distribution across nodes which are not master
candidates. By relying on SSH as the only means of doing migrations,
we could only migrate instances off of master candidates.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

5 years agoFix tests for setting (shared) file storage directory
Petr Pudlak [Wed, 10 Jun 2015 09:10:01 +0000 (11:10 +0200)]
Fix tests for setting (shared) file storage directory

- Fix the test for setting file_storage_dir, which didn't check if the
  value was really set.
- Add tests for shared_file_storage_dir, which were missing completely.

Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>