ganeti-github.git
3 years ago backend: Use uuid and not idx for bdev symlinks stable-2.14
Dimitris Aragiorgis [Mon, 25 Jan 2016 16:47:43 +0000 (18:47 +0200)]
backend: Use uuid and not idx for bdev symlinks

Until now, backend used the disk's index when creating the symlink
of the corresponding block device. This is problematic. For example,
if one starts an instance with three disks [disk0, disk1, disk2],
then removes the middle one (disk1), and then adds a new one (disk3),
the disks in config data will be [disk0, disk2, disk3] and thus the
new one will get index 2. When trying to assemble the newly created
disk we will overwrite disk2's symlink.

Fix the above behavior by creating an additional symlink based on
the disk's uuid and passing this to the hypervisor. We continue to
create an index-based symlink as well, but this behavior is
inherently problematic when using hotplug.
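
Illustration (not part of the patch): a minimal Python sketch of the
collision described above. The link-name formats, the instance name
and the helper names are hypothetical; the real backend code is not
reproduced here.

```python
def index_link(instance, idx):
    # Hypothetical naming scheme: one link per (instance, disk index).
    return "%s:%d" % (instance, idx)


def uuid_link(instance, uuid):
    # Hypothetical naming scheme: one link per (instance, disk uuid).
    return "%s:%s" % (instance, uuid)


def simulate(link_name_for):
    """Replay the scenario; return the disks whose symlink was overwritten."""
    links = {}       # symlink name -> disk it currently points to
    clobbered = []

    def assemble(instance, disks, disk):
        name = link_name_for(instance, disks, disk)
        if name in links and links[name] != disk:
            clobbered.append(links[name])
        links[name] = disk

    disks = ["disk0", "disk1", "disk2"]
    for d in disks:
        assemble("inst1", disks, d)
    disks.remove("disk1")               # config is now [disk0, disk2]
    disks.append("disk3")               # config is now [disk0, disk2, disk3]
    assemble("inst1", disks, "disk3")   # only the new disk is assembled
    return clobbered


def by_index(instance, disks, disk):
    return index_link(instance, disks.index(disk))


def by_uuid(instance, disks, disk):
    # Stand-in for a real uuid; it only needs to be stable per disk.
    return uuid_link(instance, "uuid-of-" + disk)
```

With index-based names the new disk lands on the name still used by
disk2; with uuid-based names no existing link is touched.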

To keep old instances migratable, we still create the old type
of symlink during BlockdevOpen() which is invoked on the target node
just before migration.

Also remove a really old check that did not make any sense anymore
in GetInstanceMigratable().

Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

3 years ago Merge branch 'stable-2.13' into stable-2.14
Brian Foley [Mon, 22 Aug 2016 10:40:10 +0000 (11:40 +0100)]
Merge branch 'stable-2.13' into stable-2.14

* stable-2.13
  Bugfix: migrate needs HypervisorClass, not an instance

Signed-off-by: Brian Foley <bpfoley@google.com>
Reviewed-by: Viktor Bachraty <vbachraty@google.com>

3 years ago Bugfix: migrate needs HypervisorClass, not an instance stable-2.13
David Mohr [Mon, 22 Aug 2016 10:31:47 +0000 (11:31 +0100)]
Bugfix: migrate needs HypervisorClass, not an instance

Otherwise it will complain about permissions of
/var/run/ganeti/kvm-hypervisor

Signed-off-by: David Mohr <david@mcbf.net>
Reviewed-by: Brian Foley <bpfoley@google.com>

4 years ago Support userspace disk URIs for OS import/export scripts
Nicolas Avrutin [Wed, 24 Feb 2016 21:16:14 +0000 (13:16 -0800)]
Support userspace disk URIs for OS import/export scripts

Also handle the case where a disk is accessible by URI only. Support
for the other OS scripts was added in commits
e147e00863fcf24f40fd8d9c9349d1fbce079c81 and
c4f1bff731492b73515089fdfd1c3482be05439c, but import/export scripts
have some additional cases to handle.

Signed-off-by: Nicolas Avrutin <rasputin@google.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago iallocator: only adjust memory usage for up instances
Klaus Aehlig [Mon, 22 Feb 2016 12:50:20 +0000 (13:50 +0100)]
iallocator: only adjust memory usage for up instances

With the introduction of forthcoming instances, htools will
consider two memory states: the memory usage now, and the
future one, when all instances are
created and started. In particular, it is not necessary,
and in fact harmful, to reserve the memory of currently down
instances by subtracting it from the current free memory at
the IAllocator interface. IAllocators are supposed to take future
memory usage into account in the same way as they are supposed
to ensure N+1 redundancy.
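
Illustration (not part of the patch): a minimal Python sketch of the
adjustment described above. The field names and the exact form of the
correction for running instances (configured memory minus memory
currently used) are assumptions made for the example.

```python
def adjusted_free_memory(reported_free, instances):
    """Free memory to report at the IAllocator interface (sketch).

    Each instance is a dict with the hypothetical keys "up",
    "configured_mem" and "used_mem" (all sizes in MiB).
    """
    free = reported_free
    for inst in instances:
        if not inst["up"]:
            # Down instances are no longer reserved here; their future
            # memory usage is left to the IAllocator itself.
            continue
        # A running instance may still grow to its configured size.
        free -= max(0, inst["configured_mem"] - inst["used_mem"])
    return free
```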

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Brian Foley <bpfoley@google.com>

Cherry-picked-from: eacfdba57f24
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Brian Foley <bpfoley@google.com>

4 years ago Fix failover in case the source node is offline
Dimitris Aragiorgis [Wed, 20 Jan 2016 16:37:14 +0000 (18:37 +0200)]
Fix failover in case the source node is offline

Commit ff74b60 closes instance disks on the source node before
doing a failover. In case the node is offline this is not possible.
This patch proceeds with the failover in case the source node
is offline or the --ignore-consistency flag is set. Also reduce
some config lookups for the node's name.

This fixes Issue #1162.

Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

4 years ago Merge branch 'stable-2.13' into stable-2.14
Klaus Aehlig [Thu, 14 Jan 2016 17:01:17 +0000 (18:01 +0100)]
Merge branch 'stable-2.13' into stable-2.14

* stable-2.13
  Run ssh-key renewal in debug mode during upgrade

* stable-2.12
  Increase minimal sizes of test online nodes
  Also log the high-level upgrade steps
  Add function to provide logged user feedback
  Run renew-crypto in upgrades in debug mode
  Unconditionally log upgrades at debug level
  Document healthy-majority restriction on master-failover
  Check for healthy majority on master failover with voting
  Add a predicate testing that a majority of nodes is healthy
  Fix outdated comment
  Pass arguments to correct daemons during master-failover
  Fix documentation for master-failover

* stable-2.11
  (no changes)

* stable-2.10
  KVM: explicitly configure routed NICs late

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Run ssh-key renewal in debug mode during upgrade
Klaus Aehlig [Thu, 14 Jan 2016 14:10:01 +0000 (15:10 +0100)]
Run ssh-key renewal in debug mode during upgrade

Errors during an upgrade of Ganeti are harder to understand
because two versions of Ganeti are involved. Therefore,
provide more debug information for everything that happens
during that process. Note that upgrades are a rare event,
so we do not have to worry about the size of log files
too much.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Merge branch 'stable-2.12' into stable-2.13
Klaus Aehlig [Thu, 14 Jan 2016 13:07:02 +0000 (14:07 +0100)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Increase minimal sizes of test online nodes
  Also log the high-level upgrade steps
  Add function to provide logged user feedback
  Run renew-crypto in upgrades in debug mode
  Unconditionally log upgrades at debug level
  Document healthy-majority restriction on master-failover
  Check for healthy majority on master failover with voting
  Add a predicate testing that a majority of nodes is healthy
  Fix outdated comment
  Pass arguments to correct daemons during master-failover
  Fix documentation for master-failover

* stable-2.11
  (no changes)

* stable-2.10
  KVM: explicitly configure routed NICs late

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Increase minimal sizes of test online nodes stable-2.12
Klaus Aehlig [Tue, 8 Dec 2015 16:05:10 +0000 (17:05 +0100)]
Increase minimal sizes of test online nodes

A lot of our tests work by generating a node and a
strictly smaller instance and then continue under
the assumption that the instance will fit on the node.
To obtain a strictly smaller instance, we take an instance
of size at most half the free resources of the node. The
problem with this approach is that we also require minimal
resources of an instance (for examples to be realistic); now,
this can lead to an upper bound lower than the lower bound
and, by the way QuickCheck's `choose` works, still a value
between these bounds is chosen, violating the assumptions
about node and instance sizes.

To avoid those problems, set the minimal resources of an
allocatable node so that half of them is still bigger than
the minimal resources of an instance.
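
Illustration (not part of the patch): a minimal Python sketch of the
bound arithmetic described above. The function names and the numbers
are hypothetical; the actual tests are QuickCheck generators and are
not reproduced here.

```python
def instance_size_bounds(node_free, inst_min):
    """Bounds for a generated instance size (sketch).

    The lower bound is the minimal realistic instance size; the upper
    bound is half of the node's free resources.
    """
    return inst_min, node_free // 2


def bounds_are_consistent(node_free, inst_min):
    lower, upper = instance_size_bounds(node_free, inst_min)
    return lower <= upper
```

A node with 1000 free units and a minimal instance size of 512 gives
the inverted range (512, 500); raising the node minimum to at least
twice the instance minimum rules this out.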

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

Cherry-picked-from: 6ccf05c1507c58e
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

4 years ago Merge branch 'stable-2.11' into stable-2.12
Klaus Aehlig [Tue, 12 Jan 2016 15:46:43 +0000 (16:46 +0100)]
Merge branch 'stable-2.11' into stable-2.12

* stable-2.11
  (no changes)

* stable-2.10
  KVM: explicitly configure routed NICs late

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

4 years ago Also log the high-level upgrade steps
Klaus Aehlig [Tue, 12 Jan 2016 10:37:13 +0000 (11:37 +0100)]
Also log the high-level upgrade steps

The upgrade of a Ganeti cluster is done in several
high-level steps ("Draining queue", "Pausing the watcher",
"Stopping daemons", ...). Log those headings as well in
order to simplify reading the log file; with these headings,
it is easier to understand which goal all the micro-step
RunCmd log entries are working toward.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Add function to provide logged user feedback
Klaus Aehlig [Tue, 12 Jan 2016 14:54:33 +0000 (15:54 +0100)]
Add function to provide logged user feedback

Add a utility function that provides feedback to the
user on stdout that is additionally logged (at INFO level)
in the log file.
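
Illustration (not part of the patch): a minimal Python sketch of such
a helper, using only the standard logging module. The function name
and the optional stream parameter are hypothetical.

```python
import logging
import sys


def to_stdout_and_log(msg, *args, stream=None):
    """Show feedback to the user and also record it at INFO level."""
    out = stream if stream is not None else sys.stdout
    text = msg % args if args else msg
    out.write(text + "\n")
    out.flush()
    # No extra arguments are passed, so logging does not re-interpret
    # "%" characters that may occur in the already formatted text.
    logging.info(text)
    return text
```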

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Run renew-crypto in upgrades in debug mode
Klaus Aehlig [Mon, 11 Jan 2016 17:11:43 +0000 (18:11 +0100)]
Run renew-crypto in upgrades in debug mode

Errors during an upgrade of Ganeti are harder to understand
because two versions of Ganeti are involved. Therefore,
provide more debug information for everything that happens
during that process. Note that upgrades are a rare event,
so we do not have to worry about the size of log files
too much.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Unconditionally log upgrades at debug level
Klaus Aehlig [Tue, 12 Jan 2016 09:38:50 +0000 (10:38 +0100)]
Unconditionally log upgrades at debug level

Cluster upgrades to a new minor version of Ganeti are a rare
operation (in fact, new minor versions are released only every
3 months). Therefore, we do not have to worry about increased
size of log files. However, upgrades of Ganeti are complicated
in the sense that, should something break during the upgrade, it
is not immediately obvious which state Ganeti is left in. Therefore,
always provide full log information on upgrades. Fixes issue 1137.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Merge branch 'stable-2.10' into stable-2.11 stable-2.11
Klaus Aehlig [Mon, 11 Jan 2016 11:30:30 +0000 (12:30 +0100)]
Merge branch 'stable-2.10' into stable-2.11

* stable-2.10
  KVM: explicitly configure routed NICs late

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years ago Test disk attachment with different primary nodes
Lisa Velden [Mon, 11 Jan 2016 11:12:49 +0000 (12:12 +0100)]
Test disk attachment with different primary nodes

Test a detach/attach sequence with a DRBD disk and a different primary
node for the disk and the instance. This should raise an exception.

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Check for same primary node before disk attachment
Lisa Velden [Mon, 11 Jan 2016 11:09:47 +0000 (12:09 +0100)]
Check for same primary node before disk attachment

Make sure a DRBD disk has the same primary node as the instance to
which it will be attached.

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Document healthy-majority restriction on master-failover
Klaus Aehlig [Fri, 8 Jan 2016 13:51:20 +0000 (14:51 +0100)]
Document healthy-majority restriction on master-failover

The previous patch introduced a behavioral change for master-failover:
it is rejected unless a majority of nodes is healthy or the --no-voting
option is given. (While we in general do not change behavior on a stable
branch, rejecting an operation that can be retried with different command-line
options is better than breaking the cluster completely.) Document this.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Check for healthy majority on master failover with voting
Klaus Aehlig [Fri, 8 Jan 2016 10:37:17 +0000 (11:37 +0100)]
Check for healthy majority on master failover with voting

The normal procedure for a master failover is that, after telling
each node the new master, the daemons on the new master node are
started the standard way, i.e., with voting. This, however, requires
that a majority of nodes is still healthy; otherwise, the failover
will result in the daemons not starting and thus a broken cluster.
Therefore, reject master failovers with voting, unless we can verify
that a majority of nodes is still responding.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Add a predicate testing that a majority of nodes is healthy
Klaus Aehlig [Fri, 8 Jan 2016 11:26:57 +0000 (12:26 +0100)]
Add a predicate testing that a majority of nodes is healthy

For standard master failover (with voting), it is necessary
that the majority of nodes is still reachable and can answer
questions about which node is master. Add a predicate verifying
that this is still true.
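
Illustration (not part of the patch): a minimal Python sketch of such
a predicate. The input format (a mapping from node name to the master
that node reports, or None if it did not answer) and the strict
"more than half" threshold are assumptions made for the example.

```python
def majority_healthy(answers):
    """Return True if more than half of all nodes answered (sketch)."""
    responding = sum(1 for reply in answers.values() if reply is not None)
    return 2 * responding > len(answers)
```

Under this assumption a cluster reduced to two nodes, one of them
unreachable, does not have a healthy majority.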

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Fix outdated comment
Klaus Aehlig [Fri, 8 Jan 2016 10:54:47 +0000 (11:54 +0100)]
Fix outdated comment

Commit 5e641d0a also introduced counting the vote of
the node itself. Adapt the parameter description
accordingly.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago KVM: explicitly configure routed NICs late stable-2.10
Apollon Oikonomopoulos [Wed, 2 Dec 2015 12:35:42 +0000 (14:35 +0200)]
KVM: explicitly configure routed NICs late

Commit cc8a8ed7 outlined the reasons for configuring bridged NICs early
during live migration and routed NICs after migration has been finished.
Back then these were the only types of NICs available; however, with the
introduction of OVS support this has changed.

Since OVS bridges are essentially bridges, the considerations outlined
in cc8a8ed7 still apply: in particular, we do not want to lose the
gratuitous ARP sent out by the KVM NICs, so we have to configure
the OVS interfaces early in the migration process as well.

Rather than explicitly configure bridged and OVS interfaces early, we
prefer to explicitly configure routed interfaces late, since this leads
to more compact code.
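
Illustration (not part of the patch): a minimal Python sketch of the
"routed is the exception" rule. The mode strings and the function
name are assumptions made for the example.

```python
NIC_MODE_ROUTED = "routed"  # assumed mode name


def configure_phase(nic_mode):
    """Return when a NIC is configured during live migration (sketch).

    Only routed NICs are singled out; every other mode, including any
    bridge-like mode added later, is configured early by default.
    """
    return "late" if nic_mode == NIC_MODE_ROUTED else "early"
```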

Signed-off-by: Apollon Oikonomopoulos <apoikos@gmail.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Pass arguments to correct daemons during master-failover
Hrvoje Ribicic [Thu, 17 Dec 2015 00:18:50 +0000 (00:18 +0000)]
Pass arguments to correct daemons during master-failover

A master-failover can be executed with the --no-voting flag, making
Ganeti start daemons despite a lack of votes. This is necessary to
fail over a cluster reduced to two nodes. The feature has not
been working since the 2.12 daemon refactoring, as the daemon parameters
were passed through environment variables that were not updated.

This commit passes the parameters correctly, and fixes issue 1159.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Fix documentation for master-failover
Hrvoje Ribicic [Mon, 4 Jan 2016 13:16:45 +0000 (14:16 +0100)]
Fix documentation for master-failover

The gnt-cluster manual still specified that arguments should be passed
to the master daemon - one which no longer exists. This patch specifies
the two new daemons to which arguments should be passed instead.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Add detach/attach sequence test
Lisa Velden [Wed, 16 Dec 2015 15:27:44 +0000 (16:27 +0100)]
Add detach/attach sequence test

Add a test for external storage.

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

4 years ago Allow disk attachment with external storage
Lisa Velden [Wed, 16 Dec 2015 13:57:43 +0000 (14:57 +0100)]
Allow disk attachment with external storage

As external storage is not associated with a node, we have to make an
exception for that before raising an error.

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

4 years ago Revision bump for 2.14.2 v2.14.2
Hrvoje Ribicic [Tue, 15 Dec 2015 17:54:17 +0000 (18:54 +0100)]
Revision bump for 2.14.2

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Update NEWS file for 2.14.2
Hrvoje Ribicic [Tue, 15 Dec 2015 17:53:11 +0000 (18:53 +0100)]
Update NEWS file for 2.14.2

With the security issues text and a list of minor issues.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Merge branch 'stable-2.13' into stable-2.14
Hrvoje Ribicic [Tue, 15 Dec 2015 14:44:16 +0000 (15:44 +0100)]
Merge branch 'stable-2.13' into stable-2.14

* stable-2.13
  Revision bump for 2.13.3
  Update NEWS file for 2.13.3

* stable-2.12
  Bump revision number for 2.12.6
  Update NEWS file for 2.12.6

* stable-2.11
  Revision bump for 2.11.8
  Update NEWS file for 2.11.8

* stable-2.10
  Version bump for 2.10.8
  Update NEWS file for 2.10.8

* stable-2.9
  Bump revision number
  Update NEWS file for 2.9.7 release
  Improve RAPI section on security

Conflicts:
  NEWS - Merged entries
  configure.ac - Took 2.14 version numbers

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Revision bump for 2.13.3 v2.13.3
Hrvoje Ribicic [Mon, 14 Dec 2015 18:00:43 +0000 (19:00 +0100)]
Revision bump for 2.13.3

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Update NEWS file for 2.13.3
Hrvoje Ribicic [Mon, 14 Dec 2015 17:59:26 +0000 (18:59 +0100)]
Update NEWS file for 2.13.3

With the security issues text and a list of minor issues.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Merge branch 'stable-2.12' into stable-2.13
Hrvoje Ribicic [Mon, 14 Dec 2015 17:33:14 +0000 (18:33 +0100)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Bump revision number for 2.12.6
  Update NEWS file for 2.12.6

* stable-2.11
  Revision bump for 2.11.8
  Update NEWS file for 2.11.8

* stable-2.10
  Version bump for 2.10.8
  Update NEWS file for 2.10.8

* stable-2.9
  Bump revision number
  Update NEWS file for 2.9.7 release
  Improve RAPI section on security

Conflicts:
  NEWS - Merge entries
  configure.ac - Take 2.13 version numbers

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Bump revision number for 2.12.6 v2.12.6
Hrvoje Ribicic [Mon, 14 Dec 2015 16:42:03 +0000 (17:42 +0100)]
Bump revision number for 2.12.6

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Update NEWS file for 2.12.6
Hrvoje Ribicic [Mon, 14 Dec 2015 16:41:09 +0000 (17:41 +0100)]
Update NEWS file for 2.12.6

With the security issues text and a list of minor issues.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Merge branch 'stable-2.11' into stable-2.12
Hrvoje Ribicic [Mon, 14 Dec 2015 16:15:14 +0000 (17:15 +0100)]
Merge branch 'stable-2.11' into stable-2.12

* stable-2.11
  Revision bump for 2.11.8
  Update NEWS file for 2.11.8

* stable-2.10
  Version bump for 2.10.8
  Update NEWS file for 2.10.8

* stable-2.9
  Bump revision number
  Update NEWS file for 2.9.7 release
  Improve RAPI section on security

Conflicts:
  NEWS - Merged entries
  configure.ac - Took 2.12 version numbers

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years ago Revision bump for 2.11.8 v2.11.8
Hrvoje Ribicic [Mon, 14 Dec 2015 14:07:23 +0000 (15:07 +0100)]
Revision bump for 2.11.8

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years ago Update NEWS file for 2.11.8
Hrvoje Ribicic [Mon, 14 Dec 2015 14:06:50 +0000 (15:06 +0100)]
Update NEWS file for 2.11.8

With the security issues text and a list of minor issues.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years ago Merge branch 'stable-2.10' into stable-2.11
Hrvoje Ribicic [Mon, 14 Dec 2015 13:13:03 +0000 (14:13 +0100)]
Merge branch 'stable-2.10' into stable-2.11

* stable-2.10
  Version bump for 2.10.8
  Update NEWS file for 2.10.8

* stable-2.9
  Bump revision number
  Update NEWS file for 2.9.7 release
  Improve RAPI section on security

Conflicts:
  NEWS - Combine NEWS entries from both versions
  configure.ac - Take correct version numbers

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Version bump for 2.10.8 v2.10.8
Hrvoje Ribicic [Fri, 11 Dec 2015 11:09:21 +0000 (12:09 +0100)]
Version bump for 2.10.8

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Update NEWS file for 2.10.8
Hrvoje Ribicic [Fri, 11 Dec 2015 11:08:22 +0000 (12:08 +0100)]
Update NEWS file for 2.10.8

With the security issues text and a list of minor issues.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Merge branch 'stable-2.9' into stable-2.10
Hrvoje Ribicic [Thu, 10 Dec 2015 18:04:48 +0000 (19:04 +0100)]
Merge branch 'stable-2.9' into stable-2.10

* stable-2.9
  Bump revision number
  Update NEWS file for 2.9.7 release
  Improve RAPI section on security

Conflicts:
  NEWS - leave 2.9.7 info in
  configure.ac - revert version bump

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Bump revision number stable-2.9 v2.9.7
Hrvoje Ribicic [Thu, 10 Dec 2015 16:40:51 +0000 (17:40 +0100)]
Bump revision number

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Update NEWS file for 2.9.7 release
Hrvoje Ribicic [Thu, 10 Dec 2015 16:39:53 +0000 (17:39 +0100)]
Update NEWS file for 2.9.7 release

... with security release info and minor changes.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Improve RAPI section on security
Hrvoje Ribicic [Thu, 10 Dec 2015 13:22:01 +0000 (14:22 +0100)]
Improve RAPI section on security

The RAPI section on security has been improved with new information
on how users can lock RAPI down as they see fit, and on the risks
involved with the default settings.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years ago Merge branch 'stable-2.13' into stable-2.14
Hrvoje Ribicic [Thu, 3 Dec 2015 22:55:20 +0000 (22:55 +0000)]
Merge branch 'stable-2.13' into stable-2.14

* stable-2.13
  (no changes)

* stable-2.12
  Restrict showing of DRBD secret using types
  Calculate correct affected nodes set in InstanceChangeGroup

* stable-2.11
  (no changes)

* stable-2.10
  (no changes)

* stable-2.9
  QA: Ensure the DRBD secret is not retrievable via RAPI
  Redact the DRBD secret in instance queries
  Do not attempt to use the DRBD secret in gnt-instance info

Conflicts:
  src/Ganeti/Objects.hs - Followed code to Disk.hs
  test/hs/Test/Ganeti/Objects.hs - Added Private to disk definition

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Merge branch 'stable-2.12' into stable-2.13
Hrvoje Ribicic [Thu, 3 Dec 2015 21:13:39 +0000 (21:13 +0000)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Restrict showing of DRBD secret using types
  Calculate correct affected nodes set in InstanceChangeGroup

* stable-2.11
  (no changes)

* stable-2.10
  (no changes)

* stable-2.9
  QA: Ensure the DRBD secret is not retrievable via RAPI
  Redact the DRBD secret in instance queries
  Do not attempt to use the DRBD secret in gnt-instance info

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Restrict showing of DRBD secret using types
Hrvoje Ribicic [Tue, 1 Dec 2015 16:11:38 +0000 (16:11 +0000)]
Restrict showing of DRBD secret using types

While the Python changes from 2.9 do prevent Ganeti from accidentally
revealing the DRBD secret, they may not do so forever. The queries
are planned to switch from Python to Haskell at some point, and should
someone want to use the DRBD secret, they can do so easily.

As a more elegant way of hiding the secret, wrap it in a Private
wrapper, preventing it from leaking out unless explicitly requested.
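
Illustration (not part of the patch): the patch itself changes the
Haskell types; the following is only a minimal Python analogue of the
idea, with a hypothetical class and method name.

```python
class Private:
    """Hold a value that does not show up in str() or repr() output."""

    def __init__(self, value):
        self._value = value

    def __repr__(self):
        return "<redacted>"

    __str__ = __repr__

    def reveal(self):
        # The only intended way to get at the wrapped value.
        return self._value
```

Formatting or logging a structure that contains the wrapper prints
only the placeholder; code that needs the secret has to ask for it
explicitly via reveal().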

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Merge branch 'stable-2.11' into stable-2.12
Hrvoje Ribicic [Tue, 1 Dec 2015 15:57:49 +0000 (15:57 +0000)]
Merge branch 'stable-2.11' into stable-2.12

* stable-2.11
  (no changes)

* stable-2.10
  (no changes)

* stable-2.9
  QA: Ensure the DRBD secret is not retrievable via RAPI
  Redact the DRBD secret in instance queries
  Do not attempt to use the DRBD secret in gnt-instance info

Conflicts:
  lib/client/gnt_instance.py - taken the 2.11 version, with explicit
                               parameter use
  qa/qa_rapi.py - merged imports, resolved trivial conflict

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Merge branch 'stable-2.10' into stable-2.11
Hrvoje Ribicic [Mon, 30 Nov 2015 16:12:42 +0000 (17:12 +0100)]
Merge branch 'stable-2.10' into stable-2.11

* stable-2.10
  (no changes)

* stable-2.9
  QA: Ensure the DRBD secret is not retrievable via RAPI
  Redact the DRBD secret in instance queries
  Do not attempt to use the DRBD secret in gnt-instance info

Conflicts:
  qa/qa_rapi.py - simply append new changes

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Merge branch 'stable-2.9' into stable-2.10
Hrvoje Ribicic [Mon, 30 Nov 2015 15:49:09 +0000 (16:49 +0100)]
Merge branch 'stable-2.9' into stable-2.10

* stable-2.9
  QA: Ensure the DRBD secret is not retrievable via RAPI
  Redact the DRBD secret in instance queries
  Do not attempt to use the DRBD secret in gnt-instance info

Conflicts:
  lib/cmdlib/instance_query.py - removed physical_id changes

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago QA: Ensure the DRBD secret is not retrievable via RAPI
Hrvoje Ribicic [Fri, 27 Nov 2015 17:32:42 +0000 (17:32 +0000)]
QA: Ensure the DRBD secret is not retrievable via RAPI

The best way to ensure that the DRBD secret does not inadvertently leak
is to introduce a QA test examining the output of the interface in
which the leak was originally introduced.

The test added determines the DRBD secret and makes RAPI requests,
examining them for its presence and failing if a match is found.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Redact the DRBD secret in instance queries
Hrvoje Ribicic [Fri, 27 Nov 2015 15:58:13 +0000 (15:58 +0000)]
Redact the DRBD secret in instance queries

As the DRBD secret should be used only by Ganeti internals, replacing
the actual secret with None does not hamper Ganeti's work, while
preventing the secret from being leaked.
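
Illustration (not part of the patch): a minimal Python sketch of the
redaction. It assumes, for the example only, that the disk's
identifier is a tuple whose last element is the secret; the function
name is hypothetical.

```python
def redact_drbd_secret(logical_id):
    """Return a copy of the identifier with the secret replaced by None."""
    # Assumption: the secret is the last element of the tuple.
    return tuple(logical_id[:-1]) + (None,)
```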

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Do not attempt to use the DRBD secret in gnt-instance info
Hrvoje Ribicic [Fri, 21 Aug 2015 19:46:18 +0000 (19:46 +0000)]
Do not attempt to use the DRBD secret in gnt-instance info

... so just redact what is output.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Fix lines with more than 80 characters
Lisa Velden [Fri, 27 Nov 2015 10:25:55 +0000 (11:25 +0100)]
Fix lines with more than 80 characters

Previous refactoring has introduced lines with too many characters.
This patch fixes this.

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years ago Add more detach/attach sequence tests
Lisa Velden [Wed, 25 Nov 2015 16:57:18 +0000 (17:57 +0100)]
Add more detach/attach sequence tests

Test detach/attach sequences with an instance that becomes diskless
after detaching its disk and also test detach/attach with drbd disks.

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Allow disk attachment to diskless instances
Lisa Velden [Wed, 25 Nov 2015 15:00:45 +0000 (16:00 +0100)]
Allow disk attachment to diskless instances

As only DRBD disks can be associated with more nodes than the instance
to which we want to attach the disk, we have to change the check for
associated nodes, too.

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Improve tests for attaching disks
Lisa Velden [Wed, 25 Nov 2015 13:53:39 +0000 (14:53 +0100)]
Improve tests for attaching disks

by associating disks and instances to a specific node.
Also refactor mock uuids and mock disk names into variables.

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years ago Calculate correct affected nodes set in InstanceChangeGroup
Oleg Ponomarev [Fri, 20 Nov 2015 20:45:11 +0000 (21:45 +0100)]
Calculate correct affected nodes set in InstanceChangeGroup

This fixes issue 1144. The nodes affected by the
InstanceChangeGroup logical unit were calculated incorrectly and that
broke the 'gnt-instance change-group --to' operation. This patch fixes it.

Signed-off-by: Oleg Ponomarev <oponomarev@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years ago Merge branch 'stable-2.13' into stable-2.14
Oleg Ponomarev [Wed, 11 Nov 2015 18:01:36 +0000 (19:01 +0100)]
Merge branch 'stable-2.13' into stable-2.14

* stable-2.13
  Extend timeout for gnt-cluster renew-crypto

* stable-2.12
  Revert "Also consider connection time out a network error"
  Clone lists before modifying
  Make lockConfig call retryable

* stable-2.11
  (no changes)

* stable-2.10
  Remove -X from hspace man page
  Make htools tolerate missing "dtotal" and "dfree" on luxi

Conflicts:
  tools/cfgupgrade - take the change into lib/tools/cfgupgrade

Signed-off-by: Oleg Ponomarev <oponomarev@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Merge branch 'stable-2.12' into stable-2.13
Oleg Ponomarev [Wed, 11 Nov 2015 17:14:51 +0000 (18:14 +0100)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Revert "Also consider connection time out a network error"
  Clone lists before modifying
  Make lockConfig call retryable

* stable-2.11
  (no changes)

* stable-2.10
  Remove -X from hspace man page
  Make htools tolerate missing "dtotal" and "dfree" on luxi

Signed-off-by: Oleg Ponomarev <oponomarev@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years ago Merge branch 'stable-2.11' into stable-2.12
Oleg Ponomarev [Wed, 11 Nov 2015 16:04:40 +0000 (17:04 +0100)]
Merge branch 'stable-2.11' into stable-2.12

* stable-2.11
  (no changes)

* stable-2.10
  Remove -X from hspace man page
  Make htools tolerate missing "dtotal" and "dfree" on luxi

Signed-off-by: Oleg Ponomarev <oponomarev@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years ago Merge branch 'stable-2.10' into stable-2.11
Klaus Aehlig [Wed, 11 Nov 2015 15:51:42 +0000 (16:51 +0100)]
Merge branch 'stable-2.10' into stable-2.11

* stable-2.10
  Remove -X from hspace man page
  Make htools tolerate missing "dtotal" and "dfree" on luxi

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoRevert "Also consider connection time out a network error"
Klaus Aehlig [Tue, 10 Nov 2015 16:47:44 +0000 (17:47 +0100)]
Revert "Also consider connection time out a network error"

This reverts commit 84c17185ad47070944c64ab64a8c7dfd60a260f9.
We use RetryOnNetworkError for basically every form of internal
communication. While it makes sense to retry---given that we
assume daemons might come and go at any time---we can only do
so safely, if we positively know that we did not cause any
side effect. Given that not all our requests are idempotent
(e.g., submitting jobs is not)---in fact, the majority is
not---, retrying on timeouts is not safe.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
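
The reasoning in this revert can be illustrated with a minimal Python
sketch. It is not Ganeti's RetryOnNetworkError; the helper name
retry_on_safe_errors and the choice of exception types are assumptions
made for illustration. A refused connection means the request was never
processed, so retrying is safe; a timeout leaves that open, so the error
is passed on to the caller.

```python
import socket


class RetryExhausted(Exception):
    """Raised when every attempt failed with a retryable error."""


def retry_on_safe_errors(fn, attempts=3):
    """Call fn, retrying only if the request cannot have been processed.

    Hypothetical helper for illustration, not Ganeti's actual code.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except ConnectionRefusedError as err:
            # The peer never accepted the connection, so no side effect
            # can have happened; trying again is safe.
            last_error = err
        except socket.timeout:
            # The request may already have been processed. Retrying a
            # non-idempotent call (e.g. a job submission) could run it
            # twice, so the timeout is propagated instead.
            raise
    raise RetryExhausted(str(last_error))
```

With this shape, a caller that submits a job sees the timeout and has to
decide itself whether the job went through.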

4 years agoClone lists before modifying
Klaus Aehlig [Tue, 10 Nov 2015 15:40:47 +0000 (16:40 +0100)]
Clone lists before modifying

When an opcode expands to a list of jobs, we extend the reason trail
of the new jobs with that of the original opcode that expanded to them.
Before modifying the reason trail, however, we should duplicate it to
avoid side effects on shared copies---like the default empty list.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
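
The aliasing problem described in this commit can be reproduced with a
short Python sketch. The function names and the shape of the reason
trail entries are assumptions for illustration only, not the actual
Ganeti code.

```python
# Shared default, standing in for "the default empty list".
DEFAULT_REASON = []


def extend_in_place(job_reason, opcode_reason):
    """Buggy variant: mutates whatever list the job happens to share."""
    job_reason.extend(opcode_reason)
    return job_reason


def extend_with_copy(job_reason, opcode_reason):
    """Fixed variant: clone first, then modify the private copy."""
    cloned = list(job_reason)
    cloned.extend(opcode_reason)
    return cloned
```

Calling extend_in_place(DEFAULT_REASON, trail) would leave the trail in
the shared default for every later job, whereas extend_with_copy leaves
DEFAULT_REASON untouched.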

4 years agoMake lockConfig call retryable
Klaus Aehlig [Wed, 4 Nov 2015 13:52:16 +0000 (14:52 +0100)]
Make lockConfig call retryable

Locking the configuration is naturally idempotent. However,
the corresponding WConfD call had a check refusing to lock
the config, if the caller has already locked it, arguing that
this should not happen. That argument misses that we have the
built-in assumption that daemons might be restarted at any time,
including the moment where a request is processed, but the caller
did not get the answer yet. So allow retries, however logging that
they occurred (as this should only happen rarely).

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
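
A toy Python model of the behaviour described here, under the assumption
that a lock request carries some client identifier. The class ConfigLock
and its method names are hypothetical and do not mirror the WConfD API.

```python
import logging


class ConfigLock:
    """Toy stand-in for the configuration lock (illustration only)."""

    def __init__(self):
        self.owner = None

    def lock(self, client_id):
        """Return True if client_id holds the lock after this call."""
        if self.owner is None:
            self.owner = client_id
            return True
        if self.owner == client_id:
            # A retry, e.g. because the first answer was lost when a
            # daemon restarted. Allowed, but logged as it should be rare.
            logging.warning("client %s re-requested the config lock",
                            client_id)
            return True
        # Somebody else holds the lock.
        return False
```

Repeating lock("a") is harmless in this model, which is what makes the
call safe to retry.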

4 years agoExtend timeout for gnt-cluster renew-crypto
Hrvoje Ribicic [Wed, 4 Nov 2015 13:01:38 +0000 (14:01 +0100)]
Extend timeout for gnt-cluster renew-crypto

With particularly large clusters, the renewal of SSH keys happening in
renew-crypto can take a long time to complete. While this should be
improved, an additional problem is that the RPC doing most of the work
has a default one-hour timeout. Given that it is preferable that the
operation completes, this patch bumps the timeout to four hours, which
should suffice even for 80+ node clusters.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years agoMerge branch 'stable-2.12' into stable-2.13
Hrvoje Ribicic [Mon, 2 Nov 2015 17:49:36 +0000 (17:49 +0000)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Return the correct error code in the post-upgrade script
  Make openssl refrain from DH altogether
  Fix upgrades of instances with missing creation time

Conflicts:
cfgupgrade_unittest.py: merge version tests
tools/post-upgrade: return the correct error code for SSH
                    renewal as well

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoReturn the correct error code in the post-upgrade script
Hrvoje Ribicic [Mon, 2 Nov 2015 17:19:22 +0000 (17:19 +0000)]
Return the correct error code in the post-upgrade script

While we want all the post-upgrade actions to be undertaken, should one
of these fail, the correct error code should be returned so that the
upgrade script can report issues.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
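
The pattern described in this commit, running every action and still
reporting failure, can be sketched as follows. The function run_all and
the idea of passing actions as callables are assumptions for
illustration, not the contents of the post-upgrade script.

```python
def run_all(actions):
    """Run every action even if an earlier one fails.

    Returns 0 if all actions succeeded and 1 otherwise, so the calling
    script can report the problem.
    """
    exit_code = 0
    for action in actions:
        try:
            action()
        except Exception as err:
            print("post-upgrade action failed: %s" % err)
            exit_code = 1
    return exit_code
```

A script would typically end with sys.exit(run_all([...])) so that the
exit status reaches the caller.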

4 years agoMake openssl refrain from DH altogether
Klaus Aehlig [Mon, 2 Nov 2015 10:44:34 +0000 (11:44 +0100)]
Make openssl refrain from DH altogether

As various ssl implementations have different ideas about
which dh key lengths are acceptable, refrain from standard
dh altogether (and not only from anonymous dh) to avoid
handshake problems.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years agoRemove -X from hspace man page
Klaus Aehlig [Mon, 26 Oct 2015 12:34:17 +0000 (13:34 +0100)]
Remove -X from hspace man page

hspace never had such an option.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

Cherry-picked-from: fa36daf4
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years agoFix faulty iallocator type check
Hrvoje Ribicic [Wed, 28 Oct 2015 17:56:23 +0000 (17:56 +0000)]
Fix faulty iallocator type check

Because the ignore-soft-errors parameter is optional rather than always
present, fix the type check in the iallocator request issuing code.

Signed-off-by: Gerard Oskamp <gjo@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

4 years agoImprove cfgupgrade output in case of errors
Hrvoje Ribicic [Wed, 28 Oct 2015 14:21:06 +0000 (15:21 +0100)]
Improve cfgupgrade output in case of errors

By logging with the exception function instead of the error function,
and showing the error content without the stack trace unless explicitly
debugging.

Signed-off-by: Gerard Oskamp <gjo@google.com>
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
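
One way to get this effect with Python's standard logging module is
sketched below. The helper report_failure and its debug parameter are
hypothetical; the actual cfgupgrade code may be structured differently.

```python
import logging


def report_failure(logger, err, debug=False):
    """Log err; attach the stack trace only when debugging.

    Call this from inside an except block, otherwise there is no
    active exception for exc_info to pick up.
    """
    # With exc_info=True this behaves like logger.exception().
    logger.error("Upgrade failed: %s", err, exc_info=debug)
```

Without debug, only the error content is shown; with debug=True the
traceback of the exception being handled is appended to the message.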

4 years agoFix upgrades of instances with missing creation time
Hrvoje Ribicic [Tue, 27 Oct 2015 18:38:16 +0000 (18:38 +0000)]
Fix upgrades of instances with missing creation time

Some instances from very old Ganeti versions may not have any creation
time information embedded in the config. The upgrade code does not
expect this, and crashes horribly when trying to populate newly
separate disk objects with the same creation time, and this patch
fixes things by inserting a fake value: 0.

The value was chosen because the serialization and deserialization of
such an instance in Haskell yields a value of 0 for the ctime, making
the time consistent between instance and disk. While showing the epoch
time instead of N/A in gnt-instance info is suboptimal, due to the age
of the Ganeti version in which these instances must have been created,
they are at least still ordered correctly.

Signed-off-by: Gerard Oskamp <gjo@google.com>
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
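
The fallback described in this commit can be sketched in Python as
follows. The dictionary layout (an instance dict with "ctime" and
"disks" keys) and the function name are assumptions for illustration,
not the real cfgupgrade code.

```python
def upgrade_instance_disks(instance):
    """Give each disk the instance's creation time.

    Instances from very old versions may lack "ctime"; in that case the
    fake value 0 (the epoch) is inserted so instance and disks agree.
    """
    ctime = instance.setdefault("ctime", 0)
    for disk in instance.get("disks", []):
        disk.setdefault("ctime", ctime)
    return instance
```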

4 years agoReduce flakyness of GetCmdline test on slow machines
Klaus Aehlig [Wed, 28 Oct 2015 10:54:18 +0000 (11:54 +0100)]
Reduce flakyness of GetCmdline test on slow machines

The GetCmdline test verifies that we can get the command line
of a running process via the procfs. To not have to care about
cleanup, the test creates an ephemeral process for this. While
two wall-clock seconds seem more than enough for a single read
from the procfs on today's machines, this is not true for some
of the public buildbot (virtual) machines which are extremely
low on resources and can have really heavy load; this causes
flakiness of that test there. Mitigate this by increasing the
lifetime of the process.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>

4 years agoRemove duplicated words
Lisa Velden [Tue, 27 Oct 2015 15:43:13 +0000 (16:43 +0100)]
Remove duplicated words

Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years agoMerge branch 'stable-2.13' into stable-2.14
Klaus Aehlig [Fri, 23 Oct 2015 07:52:51 +0000 (09:52 +0200)]
Merge branch 'stable-2.13' into stable-2.14

* stable-2.13
  Renew-crypto: stop daemons on master node first
  Mention manual creation of {shared,}file paths in UPGRADE
  Don't warn about broken SSH setup of offline nodes

* stable-2.12
  Fix inconsistency in python and haskell objects
  Add notSerializeDefault default field option
  Move design-disks.rst to drafts

* stable-2.11
  Fix default for --default-iallocator-params

Conflicts:
src/Ganeti/THH.hs
Resolution:
take all additions

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years agoMake htools tolerate missing "dtotal" and "dfree" on luxi
Klaus Aehlig [Tue, 16 Jun 2015 09:15:48 +0000 (11:15 +0200)]
Make htools tolerate missing "dtotal" and "dfree" on luxi

If a cluster allows sharedfile as only disk template, the amount of
total and free disk space might not be available. This is perfectly
normal, hence make the luxi backend handle it gracefully and just report
0 available disk on 0 total disk.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>

Cherry-picked-from: 49644203
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
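
The actual change is in the Haskell luxi backend of htools; the Python
sketch below only illustrates the idea. The function name and the dict
representation of a node are assumptions.

```python
def node_disk_space(node):
    """Return (total, free) disk space for a node.

    Missing or None values for "dtotal" and "dfree" are reported as 0,
    as can happen when sharedfile is the only disk template.
    """
    dtotal = node.get("dtotal")
    dfree = node.get("dfree")
    return (0 if dtotal is None else dtotal,
            0 if dfree is None else dfree)
```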

4 years agoMerge branch 'stable-2.12' into stable-2.13
Klaus Aehlig [Thu, 22 Oct 2015 08:51:36 +0000 (10:51 +0200)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  Fix inconsistency in python and haskell objects
  Add notSerializeDefault default field option
  Move design-disks.rst to drafts

* stable-2.11
  Fix default for --default-iallocator-params

Conflicts:
doc/design-draft.rst
doc/index.rst
lib/cli.py

Resolution:
for lib/cli.py follow the code move
for the rest, take all additions.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years agoMerge branch 'stable-2.11' into stable-2.12
Klaus Aehlig [Thu, 22 Oct 2015 07:13:23 +0000 (09:13 +0200)]
Merge branch 'stable-2.11' into stable-2.12

* stable-2.11
  Fix default for --default-iallocator-params

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoFix default for --default-iallocator-params
Klaus Aehlig [Wed, 21 Oct 2015 15:36:23 +0000 (17:36 +0200)]
Fix default for --default-iallocator-params

We need to distinguish between the option not being provided
(i.e., no change requested) and the option being empty (i.e.,
a request to reset the value). Therefore, use None as a default,
not {}.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
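
The distinction can be shown with a small sketch using the standard
argparse module. This is only an illustration and not Ganeti's own
option definition; the helpers parse_kv, build_parser and describe are
hypothetical.

```python
import argparse


def parse_kv(text):
    """Parse "k1=v1,k2=v2" into a dict; the empty string yields {}."""
    if not text:
        return {}
    return dict(item.split("=", 1) for item in text.split(","))


def build_parser():
    parser = argparse.ArgumentParser()
    # None means "option not given"; {} would be indistinguishable
    # from an explicitly empty value.
    parser.add_argument("--default-iallocator-params",
                        type=parse_kv, default=None)
    return parser


def describe(params):
    if params is None:
        return "no change requested"
    if params == {}:
        return "reset to empty"
    return "set to %r" % (params,)
```

With {} as the default, the first two cases in describe could not be
told apart.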

4 years agoRenew-crypto: stop daemons on master node first
Helga Velroyen [Wed, 21 Oct 2015 10:51:37 +0000 (12:51 +0200)]
Renew-crypto: stop daemons on master node first

Otherwise, this can create problems when restarting
the nodes due to voting issues.

Signed-off-by: Gerard Oskamp <gjo@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>

4 years agoMention manual creation of {shared,}file paths in UPGRADE
Helga Velroyen [Thu, 15 Oct 2015 14:11:33 +0000 (16:11 +0200)]
Mention manual creation of {shared,}file paths in UPGRADE

This fixes Issue 653. It was unclear whether or not
'ensure-dirs' creates the directories for file and
sharedfile storage. This patch extends the documentation
to clarify this.

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>

4 years agoDon't warn about broken SSH setup of offline nodes
Helga Velroyen [Wed, 14 Oct 2015 08:24:33 +0000 (10:24 +0200)]
Don't warn about broken SSH setup of offline nodes

This fixes issue 1131. 'gnt-cluster verify' should stop
complaining about broken SSH setups of offline nodes.

Additionally, this fixes a problem when readding nodes.
In some cases, Ganeti complains about a possible attack in
a situation that is actually valid when readding a node (if a key
renewal took place between offlining and readding the node).

Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years agoFix inconsistency in python and haskell objects
Oleg Ponomarev [Mon, 12 Oct 2015 14:25:33 +0000 (16:25 +0200)]
Fix inconsistency in python and haskell objects

Currently hv/disk_state_static parameters are properly supported only
for the cluster object. For node groups and nodes they were introduced in
2da9f556, however only on the python side. This could cause problems
during upgrades from old versions.

This patch adds the hv and disk state fields to the haskell objects as
notSerializedDefaultField, which fixes the problem without changing
behaviour. It also modifies the corresponding haskell arbitrary instances.

The patch is inspired by e78fb0d6 and 553363a3.

Signed-off-by: Oleg Ponomarev <oponomarev@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

4 years agoAdd notSerializeDefault default field option
Oleg Ponomarev [Mon, 12 Oct 2015 14:25:32 +0000 (16:25 +0200)]
Add notSerializeDefault default field option

Default field with notSerializedDefault flag set is a default field which
will be serialized only if its value differs from the default one. This
flag can be set by using notSerializedDefaultField field type instead of
defaultField field type.

This field is introduced in order to fix a bug of inconsistency between
haskell and python config modules which leads to inconsistent config
after a Ganeti upgrade.

Signed-off-by: Oleg Ponomarev <oponomarev@google.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Cherry-picked from: c0a2c62b9ad96c3e35cae0ffdcdf63a09164f537

Signed-off-by: Oleg Ponomarev <oponomarev@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
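
The serialization rule described above is implemented in Haskell; the
Python sketch below only models the idea. The function serialize and the
(default, omit_if_default) field description are assumptions for
illustration.

```python
def serialize(obj, fields):
    """Serialize the dict obj according to fields.

    fields maps a field name to (default, omit_if_default). A field
    with omit_if_default set is written only if its value differs from
    the default; other fields are always written.
    """
    out = {}
    for name, (default, omit_if_default) in fields.items():
        value = obj.get(name, default)
        if omit_if_default and value == default:
            continue
        out[name] = value
    return out
```

An object that never touched such a field therefore serializes exactly
as it did before the field existed.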

4 years agoMerge branch 'stable-2.13' into stable-2.14
Klaus Aehlig [Mon, 12 Oct 2015 12:54:00 +0000 (14:54 +0200)]
Merge branch 'stable-2.13' into stable-2.14

* stable-2.13
  Improve xl socat migrations

* stable-2.12
  QA: Retrieve only the RAPI certificate
  QA: Allow usage of specific RAPI certificates and files
  QA: Reload certificates only when renew-crypto has been run
  QA: Restart Ganeti after adding the RAPI users file
  QA: Add reading the RAPI password from a file
  QA: Allow the RAPI user to be set
  QA: Do not remove nodes from cluster without destroying it
  QA: Refactor RAPI handling
  Increase default disk size of burnin to 1G
  break line with more than 80 characters
  Only search for Python-2 interpreters
  Fix faulty comments / indentation
  Handle Xen 4.3 states better

* stable-2.11
  (no changes)

* stable-2.10
  Add a test for parsing of admin_state in IAlloc backend
  At IAlloc backend guess state from admin state

* stable-2.9
  Update harep's man page to notify users of its limitations

Conflicts:
src/Ganeti/HTools/Backend/IAlloc.hs
Resolution:
manually apply 9c1704a5 to stable-2.13

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoMove design-disks.rst to drafts
Klaus Aehlig [Mon, 12 Oct 2015 12:15:07 +0000 (14:15 +0200)]
Move design-disks.rst to drafts

When, in commit 2676f31, the design for stand-alone disks
was added, it was not added to the list of draft designs,
but accidentally to the list of designs not shown in the index;
the latter, however, is only for implemented designs. As this
design still isn't fully implemented, fix this now.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoMerge branch 'stable-2.12' into stable-2.13
Klaus Aehlig [Fri, 9 Oct 2015 09:04:28 +0000 (11:04 +0200)]
Merge branch 'stable-2.12' into stable-2.13

* stable-2.12
  QA: Retrieve only the RAPI certificate
  QA: Allow usage of specific RAPI certificates and files
  QA: Reload certificates only when renew-crypto has been run
  QA: Restart Ganeti after adding the RAPI users file
  QA: Add reading the RAPI password from a file
  QA: Allow the RAPI user to be set
  QA: Do not remove nodes from cluster without destroying it
  QA: Refactor RAPI handling
  Increase default disk size of burnin to 1G
  break line with more than 80 characters
  Only search for Python-2 interpreters
  Fix faulty comments / indentation
  Handle Xen 4.3 states better

* stable-2.11
  (no changes)

* stable-2.10
  Add a test for parsing of admin_state in IAlloc backend
  At IAlloc backend guess state from admin state

* stable-2.9
  Update harep's man page to notify users of its limitations

Conflicts:
qa/qa_cluster.py: trivial
qa/rapi-workload.py: keep removed (see c0065e0fa1730a477)

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoMerge branch 'stable-2.11' into stable-2.12
Klaus Aehlig [Thu, 8 Oct 2015 14:35:35 +0000 (16:35 +0200)]
Merge branch 'stable-2.11' into stable-2.12

* stable-2.11
  (no changes)

* stable-2.10
  Add a test for parsing of admin_state in IAlloc backend
  At IAlloc backend guess state from admin state

* stable-2.9
  Update harep's man page to notify users of its limitations

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoMerge branch 'stable-2.10' into stable-2.11
Klaus Aehlig [Thu, 8 Oct 2015 14:16:53 +0000 (16:16 +0200)]
Merge branch 'stable-2.10' into stable-2.11

* stable-2.10
  Add a test for parsing of admin_state in IAlloc backend
  At IAlloc backend guess state from admin state

* stable-2.9
  Update harep's man page to notify users of its limitations

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoQA: Retrieve only the RAPI certificate
Hrvoje Ribicic [Sun, 27 Sep 2015 21:55:51 +0000 (21:55 +0000)]
QA: Retrieve only the RAPI certificate

The QA previously took in the entire certificate file, along with the
private key. As this is really not necessary, change it to be more
conservative.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoQA: Allow usage of specific RAPI certificates and files
Hrvoje Ribicic [Wed, 23 Sep 2015 14:38:50 +0000 (16:38 +0200)]
QA: Allow usage of specific RAPI certificates and files

In some situations, we want to make sure the QA runs with a certain set
of certificates, secrets, users, and the like. This patch allows the QA
to look for a directory on the master node where all of these can be
found, and transplant them into the right place. This allows cluster
creation, renew-crypto, or any other cert-affecting operation to be
tested while preserving RAPI access.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoQA: Reload certificates only when renew-crypto has been run
Hrvoje Ribicic [Thu, 24 Sep 2015 10:36:31 +0000 (12:36 +0200)]
QA: Reload certificates only when renew-crypto has been run

When the cluster refreshes the RAPI certificate as it does in the
renew-crypto test, the stored certificate in the curl config of the
RAPI client has to be renewed. But it should only be renewed when the
test is enabled, so this patch moves that code into the test.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoQA: Restart Ganeti after adding the RAPI users file
Hrvoje Ribicic [Thu, 24 Sep 2015 21:20:30 +0000 (23:20 +0200)]
QA: Restart Ganeti after adding the RAPI users file

... otherwise we have no guarantee that the RAPI daemon will pick up
the change.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoQA: Add reading the RAPI password from a file
Hrvoje Ribicic [Tue, 22 Sep 2015 17:14:50 +0000 (19:14 +0200)]
QA: Add reading the RAPI password from a file

For situations where we're running the QA against a cluster which uses
a hashed password for access, it can be useful to be able to read the
password from a local file. This patch allows this to happen, throwing
in a few refactorings along the way.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoQA: Allow the RAPI user to be set
Hrvoje Ribicic [Tue, 22 Sep 2015 17:09:06 +0000 (19:09 +0200)]
QA: Allow the RAPI user to be set

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoQA: Do not remove nodes from cluster without destroying it
Hrvoje Ribicic [Tue, 22 Sep 2015 15:20:46 +0000 (17:20 +0200)]
QA: Do not remove nodes from cluster without destroying it

The Ganeti QA can be set up to optionally both create and destroy a
cluster during its runtime. Before this patch, the QA removed all the
nodes barring the master one at the end of a QA, regardless of whether
the cluster was supposed to be disassembled. This patch fixes this
behaviour and lets created clusters remain in place after a QA.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoQA: Refactor RAPI handling
Hrvoje Ribicic [Tue, 7 Jul 2015 00:49:23 +0000 (00:49 +0000)]
QA: Refactor RAPI handling

Since the QA RAPI code already uses the horror of global variables to
save the username and password within the qa_rapi module, the code can
be refactored to make the storage of these values outside the module
unnecessary. This encapsulates the RAPI functionality better, and will
allow for easier refactoring in later commits.

Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Lisa Velden <velden@google.com>

4 years agoMerge branch 'stable-2.9' into stable-2.10
Klaus Aehlig [Thu, 8 Oct 2015 13:27:59 +0000 (15:27 +0200)]
Merge branch 'stable-2.9' into stable-2.10

* stable-2.9
  Update harep's man page to notify users of its limitations

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>