Klaus Aehlig [Wed, 3 Feb 2016 11:49:52 +0000 (12:49 +0100)]
Merge branch 'stable-2.16' into stable-2.17
* stable-2.16
Update NEWS file for 2.16.0 beta2
Update NEWS file for 2.16.0 beta2
Bump version suffix to 2.16.0 beta2
Set block buffering for UDSServer
* stable-2.15
Do not add a new Inotify watchers on timer
Mock InitDrbdHelper's output in unittests
Optimise codegen for Python OpCode classes
* stable-2.14
Fix failover in case the source node is offline
Conflicts:
configure.ac: ignore suffix bump
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Viktor Bachraty [Tue, 2 Feb 2016 14:22:02 +0000 (14:22 +0000)]
Update NEWS file for 2.16.0 beta2
Update minor changes with fixes that happened after the originally
proposed release date. Update actual date of release.
Signed-off-by: Viktor Bachraty <vbachraty@google.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Klaus Aehlig [Mon, 1 Feb 2016 11:59:37 +0000 (12:59 +0100)]
Merge branch 'stable-2.15' into stable-2.16
* stable-2.15
Do not add a new Inotify watchers on timer
Mock InitDrbdHelper's output in unittests
Optimise codegen for Python OpCode classes
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Klaus Aehlig [Thu, 28 Jan 2016 18:13:00 +0000 (19:13 +0100)]
Do not add a new Inotify watchers on timer
Ganeti updates its in-memory copy of the configuration in several ways.
One of them is by using an inotify; another is by polling the file
periodically, on the order of seconds. In the latter case, the inotify does
not have to be reinstantiated; in fact, doing so results in actions being
taken several times once the inotify actually fires. Stop reinstantiating
it on the timer.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
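A minimal sketch of why re-adding a watcher on every timer tick multiplies the work. The Notifier class below is a hypothetical stand-in for an inotify handle, not Ganeti's actual code:

```python
class Notifier:
    """Hypothetical stand-in for an inotify handle; not Ganeti's API."""

    def __init__(self):
        self.watchers = []

    def add_watcher(self, callback):
        self.watchers.append(callback)

    def fire(self):
        # Every registered watcher runs once per file change.
        for callback in self.watchers:
            callback()


def reloads_after_polls(polls, readd_on_timer):
    """Count reloads caused by one file change after `polls` timer ticks."""
    reloads = []
    notifier = Notifier()
    notifier.add_watcher(lambda: reloads.append("reload"))
    for _ in range(polls):
        if readd_on_timer:  # the behaviour this commit removes
            notifier.add_watcher(lambda: reloads.append("reload"))
    notifier.fire()
    return len(reloads)
```

With the watcher re-added on each of five polls, a single change triggers six reloads; without it, one.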
Helga Velroyen [Thu, 28 Jan 2016 17:56:45 +0000 (18:56 +0100)]
Mock InitDrbdHelper's output in unittests
The output of the InitDrbdHelper function was cluttering up
the unit tests. Let's mock that output in tests.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Klaus Aehlig [Wed, 27 Jan 2016 17:38:20 +0000 (18:38 +0100)]
Fix window size in CPU collector
When determining which observations to take for computing the node load,
only keep those that happened after the beginning of the current window,
not those that happened after a window size after the beginning of the
epoch. While there, also make sure the buffer size is interpreted as
seconds, not as microseconds.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
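The window selection described in this commit can be sketched as follows. This is an illustration only; the function name and the (timestamp, value) layout are assumptions, and the real collector is not written in Python:

```python
def observations_in_window(observations, now, window_seconds):
    """Keep only (timestamp, value) pairs inside the current window.

    Timestamps and `now` are seconds since the epoch; all names here
    are illustrative assumptions.
    """
    window_start = now - window_seconds
    # The bug described above corresponds to comparing against
    # `window_seconds` alone, i.e. one window size after the epoch,
    # which keeps practically every observation.
    return [(t, v) for (t, v) in observations if t > window_start]
```

For example, with now=1000 and a 10-second window, an observation taken at t=100 is dropped while those at t=995 and t=999 are kept.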
Brian Foley [Wed, 27 Jan 2016 19:23:23 +0000 (19:23 +0000)]
Optimise codegen for Python OpCode classes
This makes hs2py output types like ht.TMaybeBool instead of
ht.TMaybe(ht.TBool). The two have equivalent behaviour, but Python
creates a new callable object at runtime for each instance of the
second, because TMaybe is a higher order function.
This optimisation saves >500kB of heap for "import opcodes" for every
Ganeti Python process.
Signed-off-by: Brian Foley <bpfoley@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
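A small sketch of the effect, using hypothetical stand-ins (t_bool, t_maybe, t_maybe_bool) rather than the real ht module: every call to a higher-order checker builds a fresh callable, whereas a prebuilt name is created once and shared.

```python
def t_bool(value):
    """Hypothetical type check: is `value` a bool?"""
    return isinstance(value, bool)


def t_maybe(check):
    """Higher-order function: returns a new closure on every call."""
    def checker(value):
        return value is None or check(value)
    return checker


# Built once at import time and shared by every user of the name.
t_maybe_bool = t_maybe(t_bool)

# Two call sites using the higher-order form get two distinct objects.
first = t_maybe(t_bool)
second = t_maybe(t_bool)
distinct = first is not second
```

Both forms accept and reject the same values; only the number of callable objects kept alive differs.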
Viktor Bachraty [Wed, 27 Jan 2016 20:38:13 +0000 (20:38 +0000)]
Update NEWS file for 2.16.0 beta2
Update both major and minor changes since beta 1 including changes
inherited from older branches.
Signed-off-by: Viktor Bachraty <vbachraty@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Viktor Bachraty [Wed, 27 Jan 2016 20:50:23 +0000 (20:50 +0000)]
Bump version suffix to 2.16.0 beta2
Signed-off-by: Viktor Bachraty <vbachraty@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Tue, 26 Jan 2016 13:17:59 +0000 (14:17 +0100)]
Parse /proc/stat data as Integer
Values output by /proc/stat are total values of time since the last reboot,
in ticks that usually are 0.01s; so after roughly 250 CPU-days, the total
time can no longer be represented as a 32-bit signed integer. Note that this
can actually happen on a node with a high enough number of CPUs or long enough
uptime. Therefore, take Integer as data type, to avoid being hit by overflows.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
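The "roughly 250 CPU-days" figure can be checked with a short calculation. The helper below additionally assumes that all CPUs accumulate ticks for the whole uptime, which is a simplification:

```python
TICK_SECONDS = 0.01        # typical tick length, as stated above
MAX_INT32 = 2**31 - 1      # largest 32-bit signed integer

seconds_until_overflow = MAX_INT32 * TICK_SECONDS
cpu_days_until_overflow = seconds_until_overflow / 86400


def uptime_days_until_overflow(cpu_count):
    """Wall-clock days until the summed counter overflows, assuming
    every CPU accumulates ticks for the whole uptime."""
    return cpu_days_until_overflow / cpu_count
```

Under that assumption, a 64-CPU node would reach the limit in under four days of uptime.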
Klaus Aehlig [Tue, 26 Jan 2016 14:01:09 +0000 (15:01 +0100)]
Add a parser definition to parse Integer
...which is like numberP, except returning an Integer instead
of an Int.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Klaus Aehlig [Tue, 26 Jan 2016 17:45:29 +0000 (18:45 +0100)]
Drop from the correct side
When forgetting data from the buffer because the maximal buffer size is reached,
forget the oldest value, not the newest one.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
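A minimal sketch of the intended behaviour, with a hypothetical helper name and a deque standing in for the collector's buffer:

```python
from collections import deque


def append_bounded(buffer, item, max_size):
    """Append `item`, forgetting the oldest entries beyond `max_size`."""
    buffer.append(item)
    while len(buffer) > max_size:
        buffer.popleft()  # the oldest value sits at the left end
    return buffer
```

Appending 0..4 to a buffer of maximal size 3 leaves the three newest values, 2, 3 and 4.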
Helga Velroyen [Fri, 22 Jan 2016 13:32:45 +0000 (14:32 +0100)]
Rearrange line-break to satisfy lint
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Hrvoje Ribicic [Fri, 22 Jan 2016 12:52:50 +0000 (13:52 +0100)]
Merge branch 'stable-2.15' into stable-2.16
* stable-2.15
(no changes)
* stable-2.14
Fix failover in case the source node is offline
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Hrvoje Ribicic [Fri, 22 Jan 2016 11:26:07 +0000 (12:26 +0100)]
Merge branch 'stable-2.14' into stable-2.15
* stable-2.14
Fix failover in case the source node is offline
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Klaus Aehlig [Thu, 21 Jan 2016 14:45:23 +0000 (15:45 +0100)]
Set block buffering for UDSServer
Commit b0a7e3771bfd changed sending of JSON-encoded answers
to standard String sending. This was necessary as converting
Strings to ByteStrings, even to lazy ones, fully enforced the
String before the first Char got out of scope and could be
garbage collected. The downside of this approach is that
we now end up with one system call per character to be sent.
The good news, however, is that the library's buffering uses
memory only a little more than a byte for a byte, so we can
afford buffering in that layer. Do so to reduce the number of
system calls.
On a not quite realistic test cluster, this resulted in the
time for a config-read going down by 1.5 orders of magnitude
with only a small increase in residual memory.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
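The change itself concerns the buffering mode of the Haskell library; the following Python analogy only illustrates the principle. CountingRaw is a hypothetical raw stream whose write calls stand in for system calls:

```python
import io


class CountingRaw(io.RawIOBase):
    """Raw stream that counts write calls, standing in for system calls."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        return len(data)


def count_writes(message, buffered):
    """Send `message` one character at a time; return raw write calls."""
    raw = CountingRaw()
    stream = io.BufferedWriter(raw, buffer_size=8192) if buffered else raw
    for char in message:
        stream.write(char.encode("ascii"))  # one write per character
    stream.flush()
    return raw.calls
```

For a 1000-character message that fits into the buffer, the unbuffered variant issues 1000 raw writes and the buffered one a single write.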
Helga Velroyen [Tue, 19 Jan 2016 14:21:17 +0000 (15:21 +0100)]
Unit test for backend.RenewCrypto
This patch adds a unit test for the successful execution
of backend.RenewCrypto. It mostly reuses infrastructure
from the unit tests for adding and removing SSH keys.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 19 Jan 2016 14:21:05 +0000 (15:21 +0100)]
SSH testutils: GetKeyOfNode
This adds a little utility function to ask the SSH file
manager for a key of one particular node.
This patch also updates some documentation of the previous
function.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 19 Jan 2016 14:19:02 +0000 (15:19 +0100)]
Make backend.RenewCrypto more testable
In order to improve the testability of backend.RenewCrypto,
this patch does two things:
* It uses the previously introduced SSH utility functions.
Those are easier to consistently mock during unit tests
and they consistently abstract the lower layer of file
operations on SSH keys.
* When calling the subfunctions to add and remove keys,
some of the optional parameters were not propagated,
which in tests will prevent the mocks from being
propagated.
Besides that, it also renames ReadRemoteSshPubKeys to
ReadRemoteSshPubKey, because that actually only fetches one
key.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 19 Jan 2016 13:28:16 +0000 (14:28 +0100)]
SSH testutils: add key generation
The SSH file manager which is used in unit tests
so far did not provide functionality to actually
generate a new key. This patch adds a very rudimentary
way of creating new keys (which works well enough
for our purposes).
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Fri, 15 Jan 2016 14:42:33 +0000 (15:42 +0100)]
Remove _ReplaceMasterKeyOnMaster
The somewhat cumbersome function _ReplaceMasterKeyOnMaster
is replaced with one of the ssh utility functions provided
in the previous patches.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Fri, 15 Jan 2016 10:18:24 +0000 (11:18 +0100)]
SSH utility functions for key manipulation
So far, the backend code contains a lot of (repetitive)
code to manipulate SSH keys on the local disk. This
patch adds utility functions for those basic operations
and also includes unit tests for those.
In the later patches of this series, those functions
will be used to simplify the code and increase
code reuse.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Thu, 14 Jan 2016 13:35:50 +0000 (14:35 +0100)]
RenewCrypto: do not consult public key file
There is a bug in the current implementation of
backend.RenewCrypto. Before re-generating keys, it checks
if the current key of each node is in the Ganeti public key
file. This was intended as a security feature, but actually
does not work like that. The Ganeti public key file
contains only the keys of the potential master candidates.
In case of a key-renewal, all nodes' keys are renewed and
that includes the normal nodes (which are not potential
master candidates). This patch removes these checks to
make sure renewal does not fail if a cluster contains
normal nodes.
Note: since potential master candidates are not fully
implemented yet, this did not show up on actual clusters.
The unit test which is implemented in a later patch of
this series revealed this flaw.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Wed, 13 Jan 2016 12:20:45 +0000 (13:20 +0100)]
SSH testutils: function to return all node UUIDs
This patch adds a utility function to the SSH test
utilities which returns all UUIDs of all nodes that
the file manager is aware of.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Fri, 15 Jan 2016 10:30:27 +0000 (11:30 +0100)]
Fix TestDetermineKeyBits
The test never ran, because it did not inherit from a
test case class. This patch fixes that and all nits that
made the tests fail.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
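The failure mode can be reproduced with the standard unittest loader. The class names below are hypothetical; only the missing base class matters:

```python
import types
import unittest


class NotCollected:  # missing unittest.TestCase base class
    def testSomething(self):
        raise AssertionError("never runs, so never fails")


class Collected(unittest.TestCase):
    def testSomething(self):
        self.assertTrue(True)


def count_collected(*classes):
    """Count the tests unittest's loader finds among `classes`."""
    module = types.ModuleType("hypothetical_test_module")
    for cls in classes:
        setattr(module, cls.__name__, cls)
    suite = unittest.defaultTestLoader.loadTestsFromModule(module)
    return suite.countTestCases()
```

The loader silently skips the plain class, so a broken test inside it goes unnoticed until the base class is added.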
Dimitris Aragiorgis [Wed, 20 Jan 2016 16:37:14 +0000 (18:37 +0200)]
Fix failover in case the source node is offline
Commit ff74b60 closes instance disks on the source node before
doing a failover. In case the node is offline this is not possible.
This patch proceeds with the failover in case the source node
is offline or the --ignore-consistency flag is set. Also reduce
some config lookups for the node's name.
This fixes Issue #1162.
Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Klaus Aehlig [Thu, 21 Jan 2016 10:02:48 +0000 (11:02 +0100)]
Merge branch 'stable-2.16' into stable-2.17
* stable-2.16
Document the increased timeout as an incompatible change
Increase timeouts for luxi by a factor of 3
Do not repeat constants in comments
Send messages as Strings
* stable-2.15
Catch IOError of SSH files when removing node
Fix renew-crypto on one-node-cluster
ssh_update: log data that is received
Increase timeout of RPC adding/removing keys
After TestNodeModify, fix the pool of master candidates
* stable-2.14
Test disk attachment with different primary nodes
Check for same primary node before disk attachment
Add detach/attach sequence test
Allow disk attachment with external storage
* stable-2.13
Run ssh-key renewal in debug mode during upgrade
* stable-2.12
Increase minimal sizes of test online nodes
Also log the high-level upgrade steps
Add function to provide logged user feedback
Run renew-crypto in upgrades in debug mode
Unconditionally log upgrades at debug level
Document healthy-majority restriction on master-failover
Check for healthy majority on master failover with voting
Add a predicate testing that a majority of nodes is healthy
Fix outdated comment
Pass arguments to correct daemons during master-failover
Fix documentation for master-failover
* stable-2.11
(no changes)
* stable-2.10
KVM: explicitly configure routed NICs late
Conflicts:
tools/post-upgrade: take all the flags
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Klaus Aehlig [Wed, 20 Jan 2016 11:13:33 +0000 (12:13 +0100)]
Document the increased timeout as an incompatible change
While the timeout for communication with luxid is mainly
an internal parameter, it also changes which response time
for Ganeti tools is still to be considered normal. Hence
warn users that might have higher level tools interacting
with Ganeti.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Wed, 20 Jan 2016 11:07:03 +0000 (12:07 +0100)]
Increase timeouts for luxi by a factor of 3
While sending answers lazily as Strings has reduced memory footprint
by over an order of magnitude, it seems that answer times have gotten
slower. Accept this trade-off, trading time for space, and increase all
timeouts accordingly.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Wed, 20 Jan 2016 11:02:49 +0000 (12:02 +0100)]
Do not repeat constants in comments
...as this works against the idea of having all constants in one
central place so that they can be changed in a simple way.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Fri, 15 Jan 2016 13:59:47 +0000 (14:59 +0100)]
Merge branch 'stable-2.15' into stable-2.16
* stable-2.15
Catch IOError of SSH files when removing node
Fix renew-crypto on one-node-cluster
ssh_update: log data that is received
Increase timeout of RPC adding/removing keys
After TestNodeModify, fix the pool of master candidates
* stable-2.14
Test disk attachment with different primary nodes
Check for same primary node before disk attachment
Add detach/attach sequence test
Allow disk attachment with external storage
* stable-2.13
Run ssh-key renewal in debug mode during upgrade
* stable-2.12
Increase minimal sizes of test online nodes
Also log the high-level upgrade steps
Add function to provide logged user feedback
Run renew-crypto in upgrades in debug mode
Unconditionally log upgrades at debug level
Document healthy-majority restriction on master-failover
Check for healthy majority on master failover with voting
Add a predicate testing that a majority of nodes is healthy
Fix outdated comment
Pass arguments to correct daemons during master-failover
Fix documentation for master-failover
* stable-2.11
(no changes)
* stable-2.10
KVM: explicitly configure routed NICs late
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Klaus Aehlig [Fri, 15 Jan 2016 10:17:06 +0000 (11:17 +0100)]
Merge branch 'stable-2.14' into stable-2.15
* stable-2.14
Test disk attachment with different primary nodes
Check for same primary node before disk attachment
Add detach/attach sequence test
Allow disk attachment with external storage
* stable-2.13
Run ssh-key renewal in debug mode during upgrade
* stable-2.12
Increase minimal sizes of test online nodes
Also log the high-level upgrade steps
Add function to provide logged user feedback
Run renew-crypto in upgrades in debug mode
Unconditionally log upgrades at debug level
Document healthy-majority restriction on master-failover
Check for healthy majority on master failover with voting
Add a predicate testing that a majority of nodes is healthy
Fix outdated comment
Pass arguments to correct daemons during master-failover
Fix documentation for master-failover
* stable-2.11
(no changes)
* stable-2.10
KVM: explicitly configure routed NICs late
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Klaus Aehlig [Thu, 14 Jan 2016 17:01:17 +0000 (18:01 +0100)]
Merge branch 'stable-2.13' into stable-2.14
* stable-2.13
Run ssh-key renewal in debug mode during upgrade
* stable-2.12
Increase minimal sizes of test online nodes
Also log the high-level upgrade steps
Add function to provide logged user feedback
Run renew-crypto in upgrades in debug mode
Unconditionally log upgrades at debug level
Document healthy-majority restriction on master-failover
Check for healthy majority on master failover with voting
Add a predicate testing that a majority of nodes is healthy
Fix outdated comment
Pass arguments to correct daemons during master-failover
Fix documentation for master-failover
* stable-2.11
(no changes)
* stable-2.10
KVM: explicitly configure routed NICs late
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Thu, 14 Jan 2016 14:10:01 +0000 (15:10 +0100)]
Run ssh-key renewal in debug mode during upgrade
As errors during an upgrade of Ganeti are harder to
understand, as two versions of Ganeti are involved,
provide more debug information for everything that happens
during that process. Note that upgrades are a rare event,
so we do not have to worry about the size of log files
too much.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Thu, 14 Jan 2016 13:07:02 +0000 (14:07 +0100)]
Merge branch 'stable-2.12' into stable-2.13
* stable-2.12
Increase minimal sizes of test online nodes
Also log the high-level upgrade steps
Add function to provide logged user feedback
Run renew-crypto in upgrades in debug mode
Unconditionally log upgrades at debug level
Document healthy-majority restriction on master-failover
Check for healthy majority on master failover with voting
Add a predicate testing that a majority of nodes is healthy
Fix outdated comment
Pass arguments to correct daemons during master-failover
Fix documentation for master-failover
* stable-2.11
(no changes)
* stable-2.10
KVM: explicitly configure routed NICs late
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Tue, 8 Dec 2015 16:05:10 +0000 (17:05 +0100)]
Increase minimal sizes of test online nodes
A lot of our tests work by generating a node and a
strictly smaller instance and then continue under
the assumption that the instance will fit on the node.
To obtain a strictly smaller instance, we take an instance
of size at most half the free resources of the node. The
problem with this approach is that we also require minimal
resources of an instance (for examples to be realistic); now,
this can lead to an upper bound lower than the lower bound
and, by the way QuickCheck's `choose` works, still a value
between these bounds is chosen, violating the assumptions
about node and instance sizes.
To avoid those problems, set the minimal resources of an
allocatable node so that half of them is still bigger than
the minimal resources of an instance.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Cherry-picked-from: 6ccf05c1507c58e
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
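The bound problem described in this commit can be sketched with plain integers. The helper names and the concrete numbers are made up for illustration; the real generators are QuickCheck generators in Haskell:

```python
def instance_size_bounds(node_free, instance_min):
    """Return (lower, upper) bounds for a generated instance size."""
    return instance_min, node_free // 2


def bounds_are_consistent(node_free, instance_min):
    """True if the range of admissible instance sizes is non-empty."""
    lower, upper = instance_size_bounds(node_free, instance_min)
    return lower <= upper
```

With 100 free units on the node and an instance minimum of 64, the upper bound (50) falls below the lower bound; raising the node minimum to at least twice the instance minimum (128 here) restores a valid range.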
Klaus Aehlig [Tue, 12 Jan 2016 15:46:43 +0000 (16:46 +0100)]
Merge branch 'stable-2.11' into stable-2.12
* stable-2.11
(no changes)
* stable-2.10
KVM: explicitly configure routed NICs late
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Helga Velroyen [Tue, 12 Jan 2016 12:38:32 +0000 (13:38 +0100)]
Clean up after failed node-add-pre hooks
If the pre hooks of a node adding operation fail, so far
a stray key of the node to be added was left on the
master node. This patch makes sure it is cleaned up
in case of a hook failure.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 12 Jan 2016 13:57:47 +0000 (14:57 +0100)]
Light-weight SSH key removal
This patch adds an RPC call, which is a very light-weight
version of removing an SSH key from the cluster. It
only removes it from the public key file of the master.
This is used later to clean up in case the pre-hooks for
adding a node fail. When adding a node with 'gnt-node add',
the client code in gnt_node adds the key to the public
key file. If the hooks fail, so far this key was not
cleaned up and manual intervention was necessary.
To avoid any abuse of the RPC call, it includes a safety
check which makes sure that only keys of nodes that are
not in the cluster anymore (and thus are stray keys) are removed.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
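A sketch of the safety check, under the assumption that the public key file can be modelled as a mapping from node UUIDs to keys. The function and parameter names are hypothetical and do not reflect the actual RPC:

```python
def remove_stray_key(pub_keys, cluster_node_uuids, node_uuid):
    """Remove `node_uuid`'s keys only if the node is not in the cluster.

    `pub_keys` maps node UUIDs to lists of keys and stands in for the
    master's public key file. All names are hypothetical.
    """
    if node_uuid in cluster_node_uuids:
        raise ValueError(
            "refusing to remove key of cluster member %s" % node_uuid)
    pub_keys.pop(node_uuid, None)
    return pub_keys
```

Requests naming a current cluster member are rejected, so the call can only clean up stray keys.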
Helga Velroyen [Tue, 12 Jan 2016 12:49:59 +0000 (13:49 +0100)]
Introduce HooksAbortCallBack
There is currently no way to clean up anything after (pre)
hooks failed. LUs have a hook that is called after the hooks
finish successfully, but any exception that aborts the hook
execution is bubbled up till mcpu and then ignored.
This patch introduces another callback called
'HooksAbortCallBack'. Similar to 'HooksCallBack', this
callback is called after the hook execution, but in this
case only if the execution fails with an exception.
After the hook is called, the exception is rethrown in
order to maintain the control flow as it was before.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
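The control flow described in this commit can be sketched as follows. The three callables are hypothetical stand-ins for the hook runner, 'HooksCallBack' and 'HooksAbortCallBack'; the actual code in mcpu is structured differently:

```python
def run_with_hooks(run_hooks, on_success, on_abort):
    """Run hooks; call `on_abort` and re-raise if they fail.

    All three parameters are hypothetical stand-ins, see above.
    """
    try:
        result = run_hooks()
    except Exception:
        on_abort()  # clean-up opportunity after failed hooks
        raise       # keep the control flow as it was before
    on_success(result)
    return result
```

Because the exception is re-raised unchanged, callers see the same failure as before; the abort callback only adds a chance to clean up.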
Helga Velroyen [Tue, 12 Jan 2016 10:39:17 +0000 (11:39 +0100)]
Add useful hints to hooks documentation
Our documentation about hooks is lacking some useful
hints for people setting up hooks for the first time.
This patch adds the information that was otherwise
only available from reading the code or the mailing list's
archive.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Klaus Aehlig [Tue, 12 Jan 2016 10:37:13 +0000 (11:37 +0100)]
Also log the high-level upgrade steps
The upgrade of a Ganeti cluster is done in several
high-level steps ("Draining queue", "Pausing the watcher",
"Stopping daemons", ...). Log those headings as well in
order to simplify reading the log file; with these headings,
it is easier to understand which goal is aimed for with
all the micro-step RunCmd log entries.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Tue, 12 Jan 2016 14:54:33 +0000 (15:54 +0100)]
Add function to provide logged user feedback
Add a utility function that provides feedback to the
user on stdout that is additionally logged (at INFO level)
in the log file.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
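A minimal sketch of such a helper, assuming Python's standard logging module. The name feedback_and_log and the optional stream parameter are assumptions made for illustration:

```python
import logging
import sys


def feedback_and_log(message, stream=None):
    """Show `message` to the user and record it at INFO level."""
    stream = sys.stdout if stream is None else stream
    stream.write(message + "\n")
    stream.flush()
    logging.info(message)
```

A call such as feedback_and_log("Draining queue") prints the heading and leaves the same text in the log, provided the logger is configured to record INFO messages.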
Klaus Aehlig [Mon, 11 Jan 2016 17:11:43 +0000 (18:11 +0100)]
Run renew-crypto in upgrades in debug mode
As errors during an upgrade of Ganeti are harder to
understand, as two versions of Ganeti are involved,
provide more debug information for everything that happens
during that process. Note that upgrades are a rare event,
so we do not have to worry about the size of log files
too much.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Tue, 12 Jan 2016 09:38:50 +0000 (10:38 +0100)]
Unconditionally log upgrades at debug level
Cluster upgrades to a new minor version of Ganeti are a rare
operation (in fact, new minor versions are released only every
3 months). Therefore, we do not have to worry about increased
size of log files. However, upgrades of Ganeti are complicated
in the sense that, should something break during the upgrade, it
is not immediately obvious in which state Ganeti is left. Therefore,
always provide full log information on upgrades. Fixes issue 1137.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Mon, 30 Nov 2015 14:19:38 +0000 (15:19 +0100)]
Send messages as Strings
ByteStrings are a more compact representation of a sequence of octets
than are Strings. However, converting a String into a ByteString, even
a lazy one, looks at a huge number of characters before the first goes
out of scope; thus the String gets enforced effectively. As Strings,
as lists of Unicode characters, have a quite memory-intensive representation,
this loss of laziness results in a memory spike that is quite significant,
at least for restricted environments like a Xen dom0, when sending the
whole Ganeti configuration.
Therefore, send messages as String over the wire to preserve laziness.
This is sound, as our JSON representation is 7-bit clean, and hence
every character coincides with its UTF8 encoding. On a larger cluster,
this saved an order of magnitude in peak memory usage.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
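The soundness argument rests on the JSON text being 7-bit clean. The sketch below uses Python's json module purely as an illustration (Ganeti's own encoder is a different implementation); the helper names are hypothetical:

```python
import json


def encode_message(obj):
    """Serialise `obj` to JSON restricted to 7-bit ASCII characters."""
    return json.dumps(obj, ensure_ascii=True)


def is_seven_bit_clean(text):
    """True if every character has a code point below 128, so that its
    UTF-8 encoding is the single byte with that same value."""
    return all(ord(char) < 128 for char in text)
```

Non-ASCII characters in the payload are escaped by the encoder, so sending the characters one by one yields exactly the UTF-8 byte sequence.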
Helga Velroyen [Mon, 11 Jan 2016 14:32:47 +0000 (15:32 +0100)]
Introduce backoff to RetryByNumberOfTimes
This patch adds a backing-off mechanism to the function
RetryByNumberOfTimes. This is useful for example when SSH
connections fail in a flaky network. The original version of
RetryByNumberOfTimes immediately retried failed SSH calls,
but that might not be enough to recover from a network
problem.
The patch adds an additional parameter 'backoff' which
specifies the base number of seconds of the backoff. That
means after the first failed try, a delay is added as long
as the backoff parameter specifies. With each additional
failed try, the delay is doubled until the maximum
number of retries is hit.
Note that the backoff parameter is not a keyword argument,
which might have been more convenient. That's because
otherwise RetryByNumberOfTimes would no longer be able
to propagate *args and **kwargs to the function to be
called with retries.
Also note that there is a function "Retry" in the same
package, which already provides somewhat complicated
timeout capabilities. However, we did not merge these
two functions, because Retry also does not propagate
*args and **kwargs properly which is something we
depend on in backend.py.
This patch also updates the unit tests and mocks the
sleep function in the backend.py's unit tests to not
slow down the tests.
This fixes issue 1078.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
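A sketch of the backoff scheme described in this commit. It is not the actual implementation; the lower-case name and the parameter order are assumptions. It shows why a positional backoff parameter leaves *args and **kwargs free to be passed through:

```python
import time


def retry_by_number_of_times(max_retries, backoff, fn, *args, **kwargs):
    """Call fn(*args, **kwargs) up to `max_retries` times.

    Sleeps `backoff` seconds after the first failure and doubles the
    delay after each further failure. Hypothetical sketch; the
    parameter order is an assumption.
    """
    delay = backoff
    for attempt in range(1, max_retries + 1):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted
            time.sleep(delay)
            delay *= 2
```

With backoff=1, the delays between attempts are 1, 2, 4, ... seconds. If backoff were a keyword argument, it could not be told apart from a keyword argument of the same name intended for fn.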
Helga Velroyen [Mon, 11 Jan 2016 13:19:45 +0000 (14:19 +0100)]
Unit tests for RetryByNumberOfTimes
As this patch series will alter the behavior of
the function RetryByNumberOfTimes, we first add a
few unit tests to ensure we don't break anything.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Klaus Aehlig [Mon, 11 Jan 2016 11:30:30 +0000 (12:30 +0100)]
Merge branch 'stable-2.10' into stable-2.11
* stable-2.10
KVM: explicitly configure routed NICs late
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Lisa Velden [Mon, 11 Jan 2016 11:12:49 +0000 (12:12 +0100)]
Test disk attachment with different primary nodes
Test a detach/attach sequence with a DRBD disk and a different primary
node for the disk and the instance. This should raise an exception.
Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Lisa Velden [Mon, 11 Jan 2016 11:09:47 +0000 (12:09 +0100)]
Check for same primary node before disk attachment
Make sure a DRBD disk has the same primary node as the instance it
will be attached to.
Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Klaus Aehlig [Fri, 8 Jan 2016 13:51:20 +0000 (14:51 +0100)]
Document healthy-majority restriction on master-failover
The previous patch introduced a behavioral change for master-failover:
it is rejected unless a majority of nodes is healthy or the --no-voting
option is given. (While we in general do not change behavior on a stable
branch, rejecting an operation that can be retried with different command-line
options is better than breaking the cluster completely.) Document this.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Fri, 8 Jan 2016 10:37:17 +0000 (11:37 +0100)]
Check for healthy majority on master failover with voting
The normal procedure for a master failover is that, after telling
each node the new master, the daemons on the new master node are
started the standard way, i.e., with voting. This, however, requires
that a majority of nodes is still healthy; otherwise, the failover
will result in the daemons not starting and thus a broken cluster.
Therefore, reject master failovers with voting, unless we can verify
that a majority of nodes is still responding.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Klaus Aehlig [Fri, 8 Jan 2016 11:26:57 +0000 (12:26 +0100)]
Add a predicate testing that a majority of nodes is healthy
For standard master failover (with voting), it is necessary
that the majority of nodes is still reachable and can answer
questions about which node is master. Add a predicate verifying
that this is still true.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
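Such a predicate might look like the sketch below. Treating "majority" as strictly more than half of all nodes is an assumption, as are the function name and the way responses are counted:

```python
def majority_healthy(responding, total):
    """True if strictly more than half of `total` nodes are responding.

    Hypothetical sketch; how responses are gathered and whether the
    local node counts are not modelled here.
    """
    return 2 * responding > total
```

Under this reading, 3 of 5 responding nodes suffice, while 2 of 4 do not.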
Klaus Aehlig [Fri, 8 Jan 2016 10:54:47 +0000 (11:54 +0100)]
Fix outdated comment
Commit 5e641d0a also introduced counting the vote of
the node itself. Adapt the parameter description
accordingly.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Helga Velroyen [Thu, 17 Dec 2015 09:03:17 +0000 (10:03 +0100)]
Catch IOError of SSH files when removing node
This patch catches an IOError when a node is removed
from a cluster and the SSH files of the node are messed
up. Previously, this caused the removal to fail, which
is not exactly what you want when removing a messed
up node.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Cherry-picked-from: a856040abc755b4
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Wed, 16 Dec 2015 10:03:23 +0000 (11:03 +0100)]
Fix renew-crypto on one-node-cluster
There was a bug which made 'gnt-cluster renew-crypto'
crash if it is a one-node cluster. This patch fixes
it by checking if there are any non-master nodes
to update at all.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Cherry-picked-from: 88ac338d88465cc0b
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 15 Dec 2015 14:03:53 +0000 (15:03 +0100)]
ssh_update: log data that is received
Debugging ssh_update can be annoying, because the data
used as input is not dumped anywhere. This patch
makes sure it gets logged (at DEBUG level) when
ssh_update receives the data.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Cherry-picked-from: 5c370ec180
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Apollon Oikonomopoulos [Wed, 2 Dec 2015 12:35:42 +0000 (14:35 +0200)]
KVM: explicitly configure routed NICs late
Commit cc8a8ed7 outlined the reasons for configuring bridged NICs early
during live migration and routed NICs after migration has been finished.
Back then these were the only types of NICs available, however with the
introduction of OVS support this has changed.
Since OVS bridges are essentially bridges, the considerations outlined
in
cc8a8ed7 still apply: in particular, we do not want to lose the
gratuitous ARP sent out by the KVM NICs, so we have to configure
the OVS interfaces early in the migration process as well.
Rather than explicitly configure bridged and OVS interfaces early, we
prefer to explicitly configure routed interfaces late, since this leads
to more compact code.
Signed-off-by: Apollon Oikonomopoulos <apoikos@gmail.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
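The "routed late, everything else early" rule from the KVM commit above can be sketched as follows. This is only an illustration: the function name, the dictionary layout and the mode strings are assumptions made for the example, not the hypervisor code itself.

```python
# Illustrative mode string; assumed for this sketch.
ROUTED = "routed"


def split_nics_by_phase(nics):
    """Return (early, late) lists of NICs for a live migration.

    Only routed NICs are singled out for late configuration. Every other
    mode (bridged, OVS, ...) ends up in the early list, so a new
    bridge-like mode needs no extra case.
    """
    early = [nic for nic in nics if nic["mode"] != ROUTED]
    late = [nic for nic in nics if nic["mode"] == ROUTED]
    return early, late
```

Testing for the one late mode instead of enumerating the early ones is what keeps the code compact.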
Helga Velroyen [Thu, 7 Jan 2016 13:27:29 +0000 (14:27 +0100)]
Increase timeout of RPC adding/removing keys
This patch increases the timeout for the RPC calls that
add and remove SSH keys to the cluster. This is necessary,
because in big clusters the distribution/removal of a
key takes too long as Ganeti has to contact every node in
the cluster.
This patch increases the timeout from URGENT to FAST
(the next higher option).
The alternatives to this include splitting up the
RPC call into several calls, which would add additional
overall runtime and RPC overhead and have security
implications. Since the higher timeout was tested
in a big cluster, we go with this for now.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Helga Velroyen [Mon, 4 Jan 2016 15:02:49 +0000 (16:02 +0100)]
Disable file-logging for tools on ext. nodes
Some tools do not necessarily run on Ganeti nodes, for example
the 'move-instance' tool which can be run from any machine
that has RAPI access. With the changes in tool-logging, the
default assumes that the tool is run on a Ganeti node, because
we expect the '/var/log/ganeti/' directory to be present.
However, to use the same infrastructure for tools that do
not fulfill this requirement, this patch adds an option to
disable file logging in this case.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
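A minimal sketch of optional file logging, as described in the "Disable file-logging for tools on ext. nodes" commit above, using only the standard logging module. The function name and its parameters are hypothetical and do not reproduce the signature of SetupToolsLogging.

```python
import logging
import sys


def setup_tool_logging(toolname, logfile=None, debug=False):
    """Configure a logger for a tool; file logging is optional.

    Hypothetical helper for illustration. With logfile=None nothing is
    written to disk, so the tool also works on machines that have no
    Ganeti log directory.
    """
    logger = logging.getLogger(toolname)
    logger.setLevel(logging.DEBUG if debug else logging.INFO)

    # Drop handlers left over from a previous call to avoid duplicate output.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()

    formatter = logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s")

    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setFormatter(formatter)
    logger.addHandler(stderr_handler)

    if logfile is not None:
        file_handler = logging.FileHandler(logfile)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    return logger
```

A tool running on an external machine would call it without a logfile, while a tool on a node would pass the path of the tools log.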
Helga Velroyen [Thu, 17 Dec 2015 10:26:41 +0000 (11:26 +0100)]
Update documentation of gnt-node {add,remove,modify}
This patch brings the documentation (NEWS, man page) of
'gnt-node {add,remove,modify}' up to date with the recent
changes.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Signed-off-by: Helga Velroyen <helgav@google.com>
Helga Velroyen [Thu, 17 Dec 2015 10:30:43 +0000 (11:30 +0100)]
Propagate --debug/verbose in gnt-node add/remove/modify
This patch adds and propagates the --verbose and --debug
options of 'gnt-node add/remove/modify' to the calls of
'ssh_update'. This way the log level can be easily
controlled and thus debugging SSH errors will become
easier.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
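The propagation described in the commit above amounts to forwarding the flags the frontend received to the helper it invokes. A rough sketch with argparse follows; the program name, the helper path and both function names are placeholders, and the flag spelling is taken from the commit message.

```python
import argparse


def parse_frontend_args(argv):
    """Parse the two logging flags of a hypothetical frontend command."""
    parser = argparse.ArgumentParser(prog="gnt-node-sketch")
    parser.add_argument("--debug", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    return parser.parse_args(argv)


def build_helper_command(helper_path, opts):
    """Build the helper's command line, forwarding the logging flags."""
    cmd = [helper_path]
    if opts.debug:
        cmd.append("--debug")
    if opts.verbose:
        cmd.append("--verbose")
    return cmd
```

With this shape, the log level of the helper is decided once, by whoever invokes the frontend.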
Helga Velroyen [Thu, 17 Dec 2015 09:03:17 +0000 (10:03 +0100)]
Catch IOError of SSH files when removing node
This patch catches an IOError when a node is removed
from a cluster and the SSH files of the node are messed
up. Previously, this caused the removal to fail, which
is not exactly what you want when removing a messed
up node.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
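The idea of the "Catch IOError of SSH files when removing node" commit above, tolerating broken SSH files instead of aborting the removal, might look roughly like this. The helper names and the line-based key format are assumptions made for the example.

```python
import logging


def read_key_lines(path):
    """Return the lines of a key file, or an empty list if it is unreadable."""
    try:
        with open(path) as key_file:
            return key_file.read().splitlines()
    except IOError as err:
        # On Python 3, IOError is an alias of OSError, so this also covers
        # missing files and permission problems.
        logging.warning("Ignoring unreadable SSH file %s: %s", path, err)
        return []


def keys_after_removal(node_name, key_file_path):
    """Hypothetical removal step: drop the lines that end in node_name.

    A missing or unreadable key file results in an empty list instead of
    an exception, so removing the node can still proceed.
    """
    lines = read_key_lines(key_file_path)
    return [line for line in lines if not line.endswith(node_name)]
```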
Helga Velroyen [Wed, 16 Dec 2015 10:03:23 +0000 (11:03 +0100)]
Fix renew-crypto on one-node-cluster
There was a bug which made 'gnt-cluster renew-crypto'
crash on a one-node cluster. This patch fixes
it by checking if there are any non-master nodes
to update at all.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
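The guard from the "Fix renew-crypto on one-node-cluster" commit above can be illustrated with a small sketch. The function names and the callback are hypothetical; the point is only that the update step is skipped when the list of non-master nodes is empty.

```python
def nodes_to_update(all_nodes, master):
    """Return every node except the master."""
    return [node for node in all_nodes if node != master]


def update_non_master_nodes(all_nodes, master, update_fn):
    """Call update_fn on the non-master nodes, if there are any.

    Returns the number of nodes that were updated.
    """
    targets = nodes_to_update(all_nodes, master)
    if not targets:
        # One-node cluster: nothing to distribute, so skip the update step.
        return 0
    update_fn(targets)
    return len(targets)
```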
Helga Velroyen [Wed, 16 Dec 2015 09:26:02 +0000 (10:26 +0100)]
Update documentation of renew-crypto
Update the documentation (NEWS, man gnt-cluster) with
the changes of this patch series so far. In particular,
mention the new tools logfile and the additional
control options of the log level for 'gnt-cluster
renew-crypto'.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Wed, 16 Dec 2015 09:04:15 +0000 (10:04 +0100)]
Increase loglevel of renew-crypto after upgrades
As renew-crypto is a common operation that fails during
upgrades, let's always call it with verbose debug level
to make sure we can help our users with debugging.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Wed, 16 Dec 2015 08:55:58 +0000 (09:55 +0100)]
Expose verbose/debug option to renew-crypto
To be able to increase the log level of SSH update
tools easily, this patch propagates the debug/verbose
option to the client side.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 15 Dec 2015 14:03:53 +0000 (15:03 +0100)]
ssh_update: log data that is received
Debugging ssh_update can be annoying, because the data
used as input is not dumped anywhere. This patch
makes sure it gets logged (at DEBUG level) when
ssh_update receives the data.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 15 Dec 2015 10:39:16 +0000 (11:39 +0100)]
Make all tools log with their name
The first patch of this series has introduced the option
of naming the tool that is logging to tools.log. This
patch updates all calls to SetupToolsLogging to
provide a toolname.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 15 Dec 2015 10:38:49 +0000 (11:38 +0100)]
Propagate verbose/debug option to ssh_update calls
This patch extends all backend functions that call
ssh_update by propagating the verbose and debug
options from the RenewCrypto function downwards.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Helga Velroyen [Tue, 15 Dec 2015 10:33:01 +0000 (11:33 +0100)]
Make SetupToolsLogging use tools logfile
This patch makes SetupToolsLogging use a dedicated
log file (tools.log) instead of spamming node-daemon.log.
Logging tools' output to node-daemon.log is not really
the correct thing to do, because tools can be called
without any node daemon running and thus it makes more
sense that they have their own log file.
This patch also makes SetupToolsLogging use the SetupTools
function, because nearly all functionality added before is
actually subsumed by SetupTools (like printing the
threadname). Only a slight modification of the log level
handling had to be added.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Klaus Aehlig [Tue, 22 Dec 2015 11:35:40 +0000 (12:35 +0100)]
After TestNodeModify, fix the pool of master candidates
The test TestNodeModify temporarily modifies the cluster parameter
candidate-pool-size, which controls the minimal desirable number of
master candidates. Depending on the size of the test cluster, this
temporary modification can be a decrease (for clusters with up to 10
nodes) or an increase (for clusters with 12 or more nodes). Ganeti's
behavior upon change of the candidate pool size is to promote nodes to
master candidates upon increase, but do nothing upon decrease. This is
a safe behavior, as too many master candidates is not a problem; the
chance of data loss is even smaller. However, it means that the test
has a side effect of, for large test clusters, increasing the actual
number of nodes that are master candidates. While not a problem for
correctness, this side effect does affect our performance tests (which
usually are run after the functional tests) as more master candidates
means more nodes to replicate information to.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
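The asymmetric behaviour described in the commit above (promote on increase, do nothing on decrease) explains why temporarily raising the pool size leaves extra master candidates behind. A toy model follows, with made-up node names and a hypothetical function:

```python
def adjust_master_candidates(nodes, candidates, pool_size):
    """Return the candidate list after the pool size changes.

    Nodes are promoted until pool_size is reached; existing candidates
    are never demoted, mirroring the behaviour described above.
    """
    result = list(candidates)
    for node in nodes:
        if len(result) >= pool_size:
            break
        if node not in result:
            result.append(node)
    return result


# A 12-node cluster that starts with 10 master candidates.
nodes = ["node%d" % i for i in range(1, 13)]
initial = nodes[:10]
raised = adjust_master_candidates(nodes, initial, 12)   # test raises the size
restored = adjust_master_candidates(nodes, raised, 10)  # test restores it
```

In this model `restored` still contains all 12 nodes, which is the side effect the commit cleans up after the test.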
Hrvoje Ribicic [Thu, 17 Dec 2015 00:18:50 +0000 (00:18 +0000)]
Pass arguments to correct daemons during master-failover
A master-failover can be executed with the --no-voting flag, making
Ganeti start daemons despite a lack of votes. This is necessary to
fail over a cluster reduced to two nodes. The feature has not
been working since the 2.12 daemon refactoring, as the daemon parameters
were passed through environment variables that were not updated.
This commit passes the parameters correctly, and fixes issue 1159.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Helga Velroyen [Tue, 5 Jan 2016 10:13:22 +0000 (11:13 +0100)]
Merge branch 'stable-2.16' into stable-2.17
* stable-2.16
Fix typo 'option' instead of 'options'
Fix error message in attachInstanceDiskChecks
Update documentation of harep
Document harep --dry-run in the man page
Support --dry-run in harep
Add a --dry-run option to htools
* stable-2.15
Add more documentation to testutils_ssh.py
renew-crypto: use bulk-removal of SSH keys
Use bulk-removal of SSH keys for single keys
Bulk-removing SSH keys of diverse set of nodes
Bulk-removal of SSH keys of normal nodes
Bulk-remove SSH keys of potential master candidates
Bulk-removal of SSH keys
testutils: add keys to own 'authorized_keys' file
Make mock SSH file manager deal with lists
Don't deepcopy the config if the old value is not needed
Revision bump for 2.15.2
Update NEWS file for 2.15.2
Compute lock allocation strictly
* stable-2.14
Revision bump for 2.14.2
Update NEWS file for 2.14.2
Fix lines with more than 80 characters
Add more detach/attach sequence tests
Allow disk attachment to diskless instances
Improve tests for attaching disks
* stable-2.13
Revision bump for 2.13.3
Update NEWS file for 2.13.3
* stable-2.12
Bump revision number for 2.12.6
Update NEWS file for 2.12.6
Restrict showing of DRBD secret using types
Calculate correct affected nodes set in InstanceChangeGroup
* stable-2.11
Revision bump for 2.11.8
Update NEWS file for 2.11.8
* stable-2.10
Version bump for 2.10.8
Update NEWS file for 2.10.8
* stable-2.9
Bump revision number
Update NEWS file for 2.9.7 release
Improve RAPI section on security
QA: Ensure the DRBD secret is not retrievable via RAPI
Redact the DRBD secret in instance queries
Do not attempt to use the DRBD secret in gnt-instance info
Conflicts:
src/Ganeti/HTools/Program/Harep.hs
Resolutions:
Harep.hs: use detectBroken from Repair.hs
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Helga Velroyen [Tue, 5 Jan 2016 09:49:13 +0000 (10:49 +0100)]
Fix typo 'option' instead of 'options'
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Helga Velroyen [Mon, 4 Jan 2016 16:07:50 +0000 (17:07 +0100)]
Merge branch 'stable-2.15' into stable-2.16
* stable-2.15
Add more documentation to testutils_ssh.py
renew-crypto: use bulk-removal of SSH keys
Use bulk-removal of SSH keys for single keys
Bulk-removing SSH keys of diverse set of nodes
Bulk-removal of SSH keys of normal nodes
Bulk-remove SSH keys of potential master candidates
Bulk-removal of SSH keys
testutils: add keys to own 'authorized_keys' file
Make mock SSH file manager deal with lists
Don't deepcopy the config if the old value is not needed
Revision bump for 2.15.2
Update NEWS file for 2.15.2
Compute lock allocation strictly
* stable-2.14
Revision bump for 2.14.2
Update NEWS file for 2.14.2
Fix lines with more than 80 characters
Add more detach/attach sequence tests
Allow disk attachment to diskless instances
Improve tests for attaching disks
* stable-2.13
Revision bump for 2.13.3
Update NEWS file for 2.13.3
* stable-2.12
Bump revision number for 2.12.6
Update NEWS file for 2.12.6
Restrict showing of DRBD secret using types
Calculate correct affected nodes set in InstanceChangeGroup
* stable-2.11
Revision bump for 2.11.8
Update NEWS file for 2.11.8
* stable-2.10
Version bump for 2.10.8
Update NEWS file for 2.10.8
* stable-2.9
Bump revision number
Update NEWS file for 2.9.7 release
Improve RAPI section on security
QA: Ensure the DRBD secret is not retrievable via RAPI
Redact the DRBD secret in instance queries
Do not attempt to use the DRBD secret in gnt-instance info
Conflicts:
NEWS
configure.ac
Resolutions:
NEWS: merge contents in right order
configure.ac: keep version number of 2.16
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Hrvoje Ribicic [Mon, 4 Jan 2016 13:16:45 +0000 (14:16 +0100)]
Fix documentation for master-failover
The gnt-cluster manual still specified that arguments should be passed
to the master daemon - one which no longer exists. This patch specifies
the two new daemons to which arguments should be passed instead.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Lisa Velden [Wed, 16 Dec 2015 15:27:44 +0000 (16:27 +0100)]
Add detach/attach sequence test
Add a test for external storage.
Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Lisa Velden [Wed, 16 Dec 2015 13:57:43 +0000 (14:57 +0100)]
Allow disk attachment with external storage
As external storage is not associated with a node, we have to make an
exception for that before raising an error.
Signed-off-by: Lisa Velden <velden@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Helga Velroyen [Tue, 1 Dec 2015 15:20:57 +0000 (16:20 +0100)]
Add more documentation to testutils_ssh.py
This patch adds more comments to the functions in
testutils_ssh.py, in particular to clarify which function
returns what types of objects.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Helga Velroyen [Tue, 24 Nov 2015 12:01:46 +0000 (13:01 +0100)]
renew-crypto: use bulk-removal of SSH keys
This patch makes renew-crypto use the newly introduced
bulk-removal function for SSH keys. This way the
complexity of renew-crypto (in terms of number of
SSH connections) becomes linear (from previously
quadratic).
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
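The quadratic-versus-linear claim in the commit above can be made concrete with a simplified model in which every visit to a node's key set stands for one SSH connection. The data layout and function names are assumptions for illustration and do not reflect the actual backend code.

```python
def remove_keys_one_by_one(authorized, keys):
    """Remove keys one at a time; every removal visits every node."""
    connections = 0
    for key in keys:
        for node in authorized:
            authorized[node].discard(key)
            connections += 1
    return connections


def remove_keys_bulk(authorized, keys):
    """Remove all keys in one pass; every node is visited once."""
    doomed = set(keys)
    connections = 0
    for node in authorized:
        authorized[node] -= doomed
        connections += 1
    return connections


def make_cluster(size):
    """Build a model cluster where every node authorizes every key."""
    keys = ["key%d" % i for i in range(size)]
    authorized = {"node%d" % i: set(keys) for i in range(size)}
    return authorized, keys
```

For n nodes with one key each, the first function makes n * n visits and the second makes n, while both leave the same key sets behind.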
Helga Velroyen [Tue, 24 Nov 2015 10:33:29 +0000 (11:33 +0100)]
Use bulk-removal of SSH keys for single keys
As the code for bulk-removal of SSH keys subsumes
the code for removing a single SSH key, let the
latter call the former.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Helga Velroyen [Fri, 20 Nov 2015 10:16:58 +0000 (11:16 +0100)]
Bulk-removing SSH keys of diverse set of nodes
This patch adds a unit test where SSH keys of a diverse
set of nodes are removed. By 'diverse', we mean a set
consisting of master candidates, potential master
candidates, and normal nodes.
It also fixes a minor bug that surfaced with that
test.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Helga Velroyen [Fri, 20 Nov 2015 09:41:12 +0000 (10:41 +0100)]
Bulk-removal of SSH keys of normal nodes
This patch adds a unit test for bulk-removing
normal nodes. Besides that, it fixes a small
bug that surfaced with that test.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Helga Velroyen [Fri, 20 Nov 2015 09:30:08 +0000 (10:30 +0100)]
Bulk-remove SSH keys of potential master candidates
This patch adds a unit test for bulk-removing potential
master candidates.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Helga Velroyen [Fri, 20 Nov 2015 09:11:44 +0000 (10:11 +0100)]
Bulk-removal of SSH keys
In order to improve the runtime complexity of
'renew-crypto', this patch adds a function to
bulk-remove SSH keys of nodes (in contrast to
the function that only removes one key at a time).
Within this patch, it is only called in a unit
test. Further patches will integrate and test it
further.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Helga Velroyen [Tue, 24 Nov 2015 10:11:41 +0000 (11:11 +0100)]
testutils: add keys to own 'authorized_keys' file
This patch updates the SSH testutils to match reality better.
So far, the test framework did not consider the fact that
the key of each node should be added to its own
'authorized_keys' file, even if the node is not a master
candidate. This patch fixes that to represent the production
behavior more accurately.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
Helga Velroyen [Thu, 19 Nov 2015 15:13:17 +0000 (16:13 +0100)]
Make mock SSH file manager deal with lists
There was a subtle bug in the unit test of backend.py
which was masking another subtle bug in the test framework
in testutils_ssh.py.
As a relic of some previous refactoring, the ssh.py
functions assume that there can be more than one public
key per node. The testutils so far assume there is only
one key per node and due to a bug, this cancelled out
nicely and was not found so far.
As we actually only have one key per node, the elegant
thing to do would be to adapt ssh.py rather than the
testutils, but that will break the interface of the
ssh_update.py tool. Since we would rather not do that
in a stable branch, this patch adapts the testutils.
The adaptation of ssh.py will be done in a newer
branch.
Additionally, this patch sprinkles assertions
everywhere so that this kind of type mix-up is found
sooner.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
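As a generic illustration of why type assertions help with the single-key-versus-list confusion described in the commit above: a plain string is itself iterable, so passing it where a list of keys is expected can go unnoticed. The function below is hypothetical and is not taken from testutils_ssh.py.

```python
def add_public_keys(key_store, node_uuid, keys):
    """Record the public keys of a node; keys must be a list of strings."""
    # Without these checks, extend() would happily split a single key
    # string into individual characters.
    assert isinstance(keys, list), "keys must be a list, not %r" % type(keys)
    assert all(isinstance(key, str) for key in keys), "each key must be a string"
    key_store.setdefault(node_uuid, []).extend(keys)
    return key_store
```

Calling it with a bare string fails immediately instead of silently storing single characters.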
Klaus Aehlig [Mon, 14 Dec 2015 14:08:22 +0000 (15:08 +0100)]
Don't deepcopy the config if the old value is not needed
The _UpgradeConfig function carries out internal upgrades of the
configuration, and additionally, if requested, saves the configuration
in case it changed in this process. To compare the old and the new
version, a deep copy of the old version is kept. As deep copying large
configurations is an expensive operation, only do it if the value is
used afterwards.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Lisa Velden <velden@google.com>
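The optimisation in the commit above can be sketched in a few lines: take the deep copy only when the old value will actually be compared later. The function and parameter names are hypothetical and the configuration is modelled as a plain dictionary.

```python
import copy


def upgrade_config(config, upgrade_fn, save_fn=None):
    """Upgrade config in place; save it only if requested and changed.

    The deep copy is the expensive part, so it is skipped entirely when
    no save function was passed and the old value is never looked at.
    """
    old = copy.deepcopy(config) if save_fn is not None else None
    upgrade_fn(config)
    if save_fn is not None and old != config:
        save_fn(config)
    return config
```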
Hrvoje Ribicic [Wed, 16 Dec 2015 12:16:57 +0000 (12:16 +0000)]
Revision bump for 2.15.2
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
Hrvoje Ribicic [Wed, 16 Dec 2015 12:16:39 +0000 (12:16 +0000)]
Update NEWS file for 2.15.2
With the security information and a list of minor changes.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
Hrvoje Ribicic [Wed, 16 Dec 2015 11:09:38 +0000 (12:09 +0100)]
Merge branch 'stable-2.14' into stable-2.15
* stable-2.14
Revision bump for 2.14.2
Update NEWS file for 2.14.2
* stable-2.13
Revision bump for 2.13.3
Update NEWS file for 2.13.3
* stable-2.12
Bump revision number for 2.12.6
Update NEWS file for 2.12.6
* stable-2.11
Revision bump for 2.11.8
Update NEWS file for 2.11.8
* stable-2.10
Version bump for 2.10.8
Update NEWS file for 2.10.8
* stable-2.9
Bump revision number
Update NEWS file for 2.9.7 release
Improve RAPI section on security
Conflicts:
NEWS - Merge entries
configure.ac - Take 2.15 revision numbers
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Hrvoje Ribicic [Tue, 15 Dec 2015 17:54:17 +0000 (18:54 +0100)]
Revision bump for 2.14.2
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
Hrvoje Ribicic [Tue, 15 Dec 2015 17:53:11 +0000 (18:53 +0100)]
Update NEWS file for 2.14.2
With the security issues text and a list of minor issues.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
Hrvoje Ribicic [Tue, 15 Dec 2015 14:44:16 +0000 (15:44 +0100)]
Merge branch 'stable-2.13' into stable-2.14
* stable-2.13
Revision bump for 2.13.3
Update NEWS file for 2.13.3
* stable-2.12
Bump revision number for 2.12.6
Update NEWS file for 2.12.6
* stable-2.11
Revision bump for 2.11.8
Update NEWS file for 2.11.8
* stable-2.10
Version bump for 2.10.8
Update NEWS file for 2.10.8
* stable-2.9
Bump revision number
Update NEWS file for 2.9.7 release
Improve RAPI section on security
Conflicts:
NEWS - Merged entries
configure.ac - Took 2.14 version numbers
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
Hrvoje Ribicic [Mon, 14 Dec 2015 18:00:43 +0000 (19:00 +0100)]
Revision bump for 2.13.3
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
Hrvoje Ribicic [Mon, 14 Dec 2015 17:59:26 +0000 (18:59 +0100)]
Update NEWS file for 2.13.3
With the security issues text and a list of minor issues.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
Hrvoje Ribicic [Mon, 14 Dec 2015 17:33:14 +0000 (18:33 +0100)]
Merge branch 'stable-2.12' into stable-2.13
* stable-2.12
Bump revision number for 2.12.6
Update NEWS file for 2.12.6
* stable-2.11
Revision bump for 2.11.8
Update NEWS file for 2.11.8
* stable-2.10
Version bump for 2.10.8
Update NEWS file for 2.10.8
* stable-2.9
Bump revision number
Update NEWS file for 2.9.7 release
Improve RAPI section on security
Conflicts:
NEWS - Merge entries
configure.ac - Take 2.13 version numbers
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Oleg Ponomarev <oponomarev@google.com>
Hrvoje Ribicic [Mon, 14 Dec 2015 16:42:03 +0000 (17:42 +0100)]
Bump revision number for 2.12.6
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>