Red Hat OpenShift Container Platform is a Platform as a Service (PaaS) that provides developers and IT organizations with a cloud application platform for deploying new applications on secure, scalable resources with minimal configuration and management overhead. OpenShift Container Platform supports a wide selection of programming languages and frameworks, such as Java, Ruby, and PHP.
Built on Red Hat Enterprise Linux and Kubernetes, OpenShift Container Platform provides a secure and scalable multi-tenant operating system for today’s enterprise-class applications, while providing integrated application runtimes and libraries. OpenShift Container Platform brings the OpenShift PaaS platform to customer data centers, enabling organizations to implement a private PaaS that meets security, privacy, compliance, and governance requirements.
Red Hat OpenShift Container Platform version 3.9 (RHBA-2018:0489) is now available. This release is based on OpenShift Origin 3.9. New features, changes, bug fixes, and known issues that pertain to OpenShift Container Platform 3.9 are included in this topic.
To better synchronize versions of OpenShift Container Platform with Kubernetes, Red Hat did not publicly release OpenShift Container Platform 3.8 and, instead, is releasing OpenShift Container Platform 3.9 directly after version 3.7. See Installation for information on how this impacts installation and upgrade processes.
OpenShift Container Platform 3.9 is supported on RHEL 7.3 and 7.4 with the latest packages from Extras, including Docker 1.13. It is also supported on Atomic Host 7.4.5 and newer. The docker-latest package is now deprecated.
For initial installations, see the Installing a Cluster topics in the Installation and Configuration documentation.
To upgrade to this release from a previous version, see the Upgrading Clusters topic.
This release adds improvements related to the following components and concepts.
Now, when pruning images, you do not have to remove the actual image, just update etcd storage.
It is safer to run pruning with the --keep-tag-revisions and --keep-younger-than options. After this is run, administrators can choose to run hard prune (which is safe to run as long as the registry is put in read-only mode).
The installation playbooks in OpenShift Container Platform 3.9 have been updated to support Red Hat CloudForms Management Engine (CFME) 4.6, which is now available. See the new Deploying Red Hat CloudForms on OpenShift Container Platform topics for further information.
In addition, this release includes the following new features and updates:
OpenShift Container Platform template provisioning
Offline OpenSCAP scans
Alert management: You can choose Prometheus (currently in Technology Preview) and use it in CloudForms.
Reporting enhancements
Provider updates
Chargeback enhancements
UX enhancements
CRI-O is a lightweight, native Kubernetes container runtime interface. By design, it provides only the runtime capabilities needed by the kubelet. CRI-O is designed to be part of Kubernetes and evolve in lock-step with the platform.
CRI-O brings:
A minimal and secure architecture.
Excellent scale and performance.
The ability to run any Open Container Initiative (OCI) or docker image.
Familiar operational tooling and commands.
To install and run CRI-O alongside docker, set the following in the [OSEv3:vars] section of the Ansible inventory file during cluster installation:
openshift_use_crio=true
This setting pulls the openshift3/cri-o system container image from the Red Hat Registry by default. If you want to use an alternative CRI-O system container image from another registry, you can also override the default using the following variable:
openshift_crio_systemcontainer_image_override=<registry>/<repo>/<image>:<tag>
When CRI-O use is enabled, it is installed alongside docker, which currently is required to perform build and push operations to the registry. Over time, temporary docker builds can accumulate on nodes. You can optionally set the following to enable garbage collection, which adds a daemonset to clean out the builds:
openshift_crio_enable_docker_gc=true
When enabled, garbage collection runs on all nodes by default. You can also limit the daemonset to specific nodes by setting the following:
openshift_crio_docker_gc_node_selector={'runtime': 'cri-o'}
For example, the above would ensure it is only run on nodes with the runtime: cri-o label. This can be helpful if you are running CRI-O only on some nodes, and others are only running docker.
See the upstream documentation for more information on CRI-O.
You can expand persistent volume claims online from OpenShift Container Platform for CNS GlusterFS, Cinder, and GCE PD.
Create a storage class with allowVolumeExpansion=true.
The PVC uses the storage class and submits a claim.
The PVC specifies a new increased size.
The underlying PV is resized.
You can expand persistent volume claims online from OpenShift Container Platform for CNS GlusterFS volumes.
Previously, this was only available from the Heketi CLI. You edit the PVC with the new size, triggering a PV resize. This is fully qualified for GlusterFS-backed PVs. Gluster-block PV resize will be added with RHEL 7.5.
Add allowVolumeExpansion=true to the storage class.
Run:
$ oc edit pvc claim-name
Edit the spec.resources.requests.storage field with the new value.
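Because a resize is triggered by requesting a larger size, it can help to check a proposed value before editing the PVC. The following is a minimal sketch, not part of the product: the function names are hypothetical, and only the binary suffixes Ki, Mi, Gi, and Ti are handled.

```python
# Binary suffixes used by Kubernetes resource quantities (subset only; assumption
# for this sketch, other suffixes are not handled).
_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '10Gi' to bytes; a bare integer is taken as bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def is_valid_expansion(current: str, requested: str) -> bool:
    """Return True only when the requested size is strictly larger than the current one."""
    return quantity_to_bytes(requested) > quantity_to_bytes(current)


if __name__ == "__main__":
    print(is_valid_expansion("5Gi", "10Gi"))
    print(is_valid_expansion("10Gi", "512Mi"))
```

A check like this only guards against typing a smaller value by mistake; whether the resize succeeds still depends on the storage class and the back end.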
Container Native Storage GlusterFS is extended to provide volume metrics (including consumption) through Prometheus or Query.
Metrics are available from the PVC endpoint. This adds visibility to what is being allocated and what is being consumed. Previously, you could only see allocated size of the PVs. Now, you know how much is really consumed so, if needed, you can expand it before it runs out of space. This also allows administrators to do billing based on consumption, if needed.
Examples of added metrics include:
kubelet_volume_stats_capacity_bytes
kubelet_volume_stats_inodes
kubelet_volume_stats_inodes_free
kubelet_volume_stats_inodes_used
kubelet_volume_stats_used_bytes
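As an illustration of how the capacity and consumption metrics can be combined, the sketch below computes the used fraction of a volume and flags it for expansion. The metric names come from the list above; the function names, the dictionary input format, and the 80% threshold are assumptions made for this example.

```python
def volume_usage(samples: dict) -> dict:
    """Derive usage figures from one volume's metric samples (values in bytes)."""
    capacity = samples["kubelet_volume_stats_capacity_bytes"]
    used = samples["kubelet_volume_stats_used_bytes"]
    return {
        "used_fraction": used / capacity,
        "free_bytes": capacity - used,
    }


def needs_expansion(samples: dict, threshold: float = 0.8) -> bool:
    """Flag a volume whose consumption has reached the (hypothetical) threshold."""
    return volume_usage(samples)["used_fraction"] >= threshold


if __name__ == "__main__":
    example = {
        "kubelet_volume_stats_capacity_bytes": 10 * 2**30,
        "kubelet_volume_stats_used_bytes": 9 * 2**30,
    }
    print(volume_usage(example), needs_expansion(example))
```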
In the OpenShift Container Platform advanced installer, the CNS block provisioner deployment is fixed and the CNS Un-install Playbook is added. This resolves the issue of CNS block deployment with OpenShift Container Platform and also provides a way to uninstall a failed installation of CNS.
CNS storage device details are added to the installer’s inventory file. The advanced installer manages configuration and deployment of CNS, file and block provisioners, registry, and ready-to-use PVs.
Updated guidance around Cluster Limits for OpenShift Container Platform 3.9 is now available.
This is a feature currently in Technology Preview and not for production workloads.
Device plug-ins allow you to use a particular device type (GPU, InfiniBand, or other similar computing resources that require vendor-specific initialization and setup) in your OpenShift Container Platform pod without needing to write custom code. The device plug-in provides a consistent and portable solution to consume hardware devices across clusters. The device plug-in provides support for these devices through an extension mechanism, which makes these devices available to containers, provides health checks of these devices, and securely shares them.
A device plug-in is a gRPC service running on the nodes (external to atomic-openshift-node.service) that is responsible for managing specific hardware resources.
See the Developer Guide for further conceptual information about Device Plug-ins.
CPU Manager is a feature currently in Technology Preview and not for production workloads.
CPU Manager manages groups of CPUs and constrains workloads to specific CPUs.
CPU Manager is useful for workloads that have some of these attributes:
Require as much CPU time as possible.
Are sensitive to processor cache misses.
Are low-latency network applications.
Coordinate with other processes and benefit from sharing a single processor cache.
See Using CPU Manager for more information.
Device Manager is a feature currently in Technology Preview and not for production workloads.
Some users want to set resource limits for hardware devices within their pod definition and have the scheduler find the node in the cluster with those resources. At the same time, Kubernetes needed a way for hardware vendors to advertise their resources to the kubelet without forcing them to change core code within Kubernetes.
The kubelet now houses a device manager that is extensible through plug-ins. You load the driver support at the node level. Then, you or the vendor writes a plug-in that listens for requests to stop/start/attach/assign the requested hardware resources seen by the drivers. This plug-in is deployed to all the nodes via a daemonSet.
See Using Device Manager for more information.
Huge pages is a feature currently in Technology Preview and not for production workloads.
Memory is managed in blocks known as pages. On most systems, a page is 4Ki. 1Mi of memory is equal to 256 pages; 1Gi of memory is 262,144 pages, and so on. CPUs have a built-in memory management unit that manages a list of these pages in hardware. The Translation Lookaside Buffer (TLB) is a small hardware cache of virtual-to-physical page mappings. If the virtual address passed in a hardware instruction can be found in the TLB, the mapping can be determined quickly. If not, a TLB miss occurs, and the system falls back to slower, software-based address translation, resulting in performance issues. Since the size of the TLB is fixed, the only way to reduce the chance of a TLB miss is to increase the page size.
A huge page is a memory page that is larger than 4Ki. On x86_64 architectures, there are two common huge page sizes: 2Mi and 1Gi. Sizes vary on other architectures. In order to use huge pages, code must be written so that applications are aware of them. Transparent Huge Pages (THP) attempt to automate the management of huge pages without application knowledge, but they have limitations. In particular, they are limited to 2Mi page sizes. THP can lead to performance degradation on nodes with high memory utilization or fragmentation due to defragmenting efforts of THP, which can lock memory pages. For this reason, some applications may be designed to (or recommend) usage of pre-allocated huge pages instead of THP.
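The effect of page size on the number of mappings can be worked out directly. The sketch below (the function name is hypothetical) counts how many pages are needed to map a given amount of memory at the 4Ki, 2Mi, and 1Gi page sizes mentioned above.

```python
KI, MI, GI = 2**10, 2**20, 2**30


def pages_needed(memory_bytes: int, page_size_bytes: int) -> int:
    """Number of pages required to cover memory_bytes (ceiling division)."""
    return -(-memory_bytes // page_size_bytes)


if __name__ == "__main__":
    for label, page_size in (("4Ki", 4 * KI), ("2Mi", 2 * MI), ("1Gi", GI)):
        print(f"{label} pages needed to map 1Gi: {pages_needed(GI, page_size)}")
```

Fewer, larger pages mean fewer entries compete for the fixed-size TLB, which is the motivation for huge pages described above.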
In OpenShift Container Platform, applications in a pod can allocate and consume pre-allocated huge pages.
See Managing Huge Pages for more information.
All outgoing external connections from a project share a single, fixed source IP address and send all traffic via that IP, so that external firewalls can recognize the application associated with a packet.
This feature is semi-automatic: as the first half of the automatic namespace-wide egress IP feature, it implements the "traffic" side. Namespaces with automatic egress IPs send all traffic via that IP. However, it does not implement the "management" side. Nothing automatically assigns egress IPs to nodes yet, so the administrator must do that manually.
See Managing Networking for more information.
Route configuration changes and process upgrades performed under heavy load have typically required a stop and start sequence of certain services, causing temporary outages.
In OpenShift Container Platform 3.9, HAProxy 1.8 sees no difference between updates and upgrades; a new process is used with a new configuration, and the listening socket’s file descriptor is transferred from the old to the new process so the connection is never closed. The change is seamless, and paves the way for future capabilities such as HTTP/2.
In OpenShift Container Platform, statefulsets, daemonsets, and deployments are now stable, supported, and out of Technology Preview.
Provides auditing of items that administrators would like to see, including:
The event timestamp.
The activity that generated the entry.
The API endpoint that was called.
The HTTP output.
The item changed due to an activity, with details of the change.
The user name of the user that initiated an activity.
The name of the namespace the event occurred in, where possible.
The status of the event, either success or failure.
Provides auditing of items that administrators would like to trace, including:
User login and logout (including session timeout) from the web interface, including unauthorized access attempts.
Account creation, modification, or removal.
Account role or policy assignment or de-assignment.
Scaling of pods.
Creation of new project or application.
Creation of routes and services.
Triggers of builds and/or pipelines.
Addition or removal or claim of persistent volumes.
Set up auditing in the master-config file, and restart the master-config service:
auditConfig:
  auditFilePath: "/var/log/audit-ocp.log"
  enabled: true
  maximumFileRetentionDays: 10
  maximumFileSizeMegabytes: 10
  maximumRetainedFiles: 10
  logFormat: json
  policyConfiguration: null
  policyFile: /etc/origin/master/audit-policy.yaml
  webHookKubeConfig: ""
  webHookMode:
Example log output:
{"kind":"Event","apiVersion":"audit.k8s.io/v1beta1","metadata":{"creationTimestamp":"2017-09-29T09:46:39Z"},"level":"Metadata","timestamp":"2017-09-29T09:46:39Z","auditID":"72e66a64-c3e5-4201-9a62-6512a220365e","stage":"ResponseComplete","requestURI":"/api/v1/securitycontextconstraints","verb":"create","user":{"username":"system:admin","groups":["system:cluster-admins","system:authenticated"]},"sourceIPs":["10.8.241.75"],"objectRef":{"resource":"securitycontextconstraints","name":"scc-lg","apiVersion":"/v1"},"responseStatus":{"metadata":{},"code":201}}
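With logFormat set to json, each audit entry is a single JSON object per line, so the audited items listed earlier can be pulled out with a few lines of code. The following is a minimal sketch: the function name is hypothetical, the sample line is a trimmed copy of the output above, and treating HTTP codes 200 through 399 as success is an assumption of this example.

```python
import json

# Trimmed copy of the example log line shown above.
SAMPLE_LINE = (
    '{"kind":"Event","level":"Metadata",'
    '"timestamp":"2017-09-29T09:46:39Z","stage":"ResponseComplete",'
    '"requestURI":"/api/v1/securitycontextconstraints","verb":"create",'
    '"user":{"username":"system:admin"},'
    '"objectRef":{"resource":"securitycontextconstraints","name":"scc-lg"},'
    '"responseStatus":{"code":201}}'
)


def summarize_audit_event(line: str) -> dict:
    """Extract timestamp, user, verb, endpoint, resource, and status from one log line."""
    event = json.loads(line)
    code = event["responseStatus"]["code"]
    return {
        "timestamp": event["timestamp"],
        "user": event["user"]["username"],
        "verb": event["verb"],
        "endpoint": event["requestURI"],
        "resource": event["objectRef"]["resource"],
        # Assumption: 2xx and 3xx responses count as success.
        "status": "success" if code in range(200, 400) else "failure",
    }


if __name__ == "__main__":
    print(summarize_audit_event(SAMPLE_LINE))
```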
The oc status command provides an overview of the current project. It now provides output for upstream deployments similar to that for downstream DeploymentConfigs, with a nested deployment set:
$ oc status
In project My Project (myproject) on server https://127.0.0.1:8443

svc/ruby-deploy - 172.30.174.234:8080
  deployment/ruby-deploy deploys istag/ruby-deploy:latest <-
    bc/ruby-deploy source builds https://github.com/openshift/ruby-ex.git on istag/ruby-22-centos7:latest
      build #1 failed 5 hours ago - bbb6701: Merge pull request #18 from durandom/master (Joe User <joeuser@users.noreply.github.com>)
    deployment #2 running for 4 hours - 0/1 pods (warning: 53 restarts)
    deployment #1 deployed 5 hours ago
Compare this to the output from OpenShift Container Platform 3.7:
$ oc status
In project dc-test on server https://127.0.0.1:8443

svc/ruby-deploy - 172.30.231.16:8080
  pod/ruby-deploy-5c7cc559cc-pvq9l runs test
Dynamic Admission Controller Follow-up is a feature currently in Technology Preview and not for production workloads.
An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized. Example use cases include mutation of pod resources and security response.
See Custom Admission Controllers for more information.
Platform administrators can now turn off specific features for the entire platform. This helps control access to alpha, beta, or Technology Preview features in production clusters.
Feature gates use a key=value pair in the master and kubelet configuration files that describe the feature you want to block.
kubernetesMasterConfig:
  apiServerArguments:
    feature-gates:
    - CPUManager=true

kubeletArguments:
  feature-gates:
  - DevicePlugin=true
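Each feature-gates entry is a key=value string. As a rough illustration of that format only (the function name is hypothetical and this is not how the master or kubelet parse their configuration), the entries can be turned into a mapping of feature name to enabled state:

```python
def parse_feature_gates(entries):
    """Turn entries such as 'CPUManager=true' into {'CPUManager': True}."""
    gates = {}
    for entry in entries:
        name, _, value = entry.partition("=")
        value = value.strip().lower()
        if not name.strip() or value not in ("true", "false"):
            raise ValueError(f"expected <Feature>=true|false, got {entry!r}")
        gates[name.strip()] = value == "true"
    return gates


if __name__ == "__main__":
    print(parse_feature_gates(["CPUManager=true", "DevicePlugin=false"]))
```

Setting a gate to false is what blocks the corresponding feature; true enables it.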
OpenShift Container Platform 3.9 introduces significant refactoring and restructuring of the playbooks to improve performance. This includes:
Restructured playbooks to push all fact-gathering and common dependencies up into the initialization plays so they are only called once rather than each time a role needs access to their computed values.
Refactored playbooks to limit the hosts they touch to only those that are truly relevant to the playbook.
Quick Installation is now deprecated in OpenShift Container Platform 3.9 and will be completely removed in a future release.
Quick installation will only be capable of installing 3.9. It will not be able to upgrade from 3.7 or 3.8 to 3.9.
The installer automatically handles stepping the control plane from 3.7 to 3.8 to 3.9 and node upgrade from 3.7 to 3.9.
Control plane components (API, controllers, and nodes on control plane hosts) are upgraded seamlessly from 3.7 to 3.8 to 3.9. Data migration happens pre- and post- OpenShift Container Platform 3.8 and 3.9 control plane upgrades. Other control plane components (router, registry, service catalog, and brokers) are upgraded from OpenShift Container Platform 3.7 to 3.9. Nodes (node, docker, ovs) are upgraded directly from OpenShift Container Platform 3.7 to 3.9 with only one drain of nodes. OpenShift Container Platform 3.7 nodes operate indefinitely against 3.8 masters should the upgrade process need to pause in this state. Logging and metrics are updated from OpenShift Container Platform 3.7 to 3.9.
It is recommended that you upgrade the control plane and nodes independently. You can still perform the upgrade through an all-in-one playbook, but rollback is more difficult. Playbooks do not allow for a clean installation of OpenShift Container Platform 3.8.
See Upgrading Clusters for more information.
syslog Output Plug-in for fluentd is a feature currently in Technology Preview and not for production workloads.
You can send system and container logs from OpenShift Container Platform nodes to external endpoints using the syslog protocol. The fluentd syslog output plug-in supports this.
Logs sent via syslog are not encrypted and, therefore, insecure.
See Sending Logs to an External Syslog Server for more information.
Prometheus remains in Technology Preview and is not for production workloads. Prometheus, AlertManager, and AlertBuffer versions are now updated and node-exporter is now included:
prometheus 2.1.0
Alertmanager 0.14.0
AlertBuffer 0.2
node_exporter 0.15.2
You can deploy Prometheus on an OpenShift Container Platform cluster, collect Kubernetes and infrastructure metrics, and get alerts. You can see and query metrics and alerts on the Prometheus web dashboard. Alternatively, you can bring your own Grafana and hook it up to Prometheus.
See Prometheus on OpenShift for more information.
Previously, Jenkins worker pods would often consume too much or too little memory. Now, a startup script inspects the pod limits and sets environment variables so that spawned JVMs respect those limits.
CLI plug-ins are now fully supported.
Usually called plug-ins or binary extensions, this feature allows you to extend the default set of oc commands available and, therefore, allows you to perform new tasks.
See Extending the CLI for information on how to install and write extensions for the CLI.
Previously, there was not a way to set a default toleration on build pods so they could be placed on build-specific nodes. The build defaulter is now updated to allow the specification of a toleration value, which is applied to the build pod upon creation.
See Configuring Global Build Defaults and Overrides for more information.
Quickly get to the catalog from within a project by clicking Catalog in the left navigation.
To quickly find services from within project view, type in your search criteria.
You can now jump straight to certain pages after login. Access the menu from the account dropdown, choose your option, then log out, then log back in.
You can now configure the web console to log users out after a set timeout. The default is 0 (never). Set the Ansible variable to the number of minutes:
openshift_web_console_inactivity_timeout_minutes=n
The web console is now separated out of the API server. The web console is packaged as a container image and deployed as a pod. Configure via the ConfigMap. Changes are auto-detected.
Masters are now schedulable, and must be schedulable for the web console deployment to work.
OpenShift Container Platform 3.9 introduces the following notable technical changes.
As of OpenShift Container Platform 3.9, manual upgrades are not supported. In a future release, this process will be removed.
In previous versions of OpenShift Container Platform, master hosts were marked as unschedulable nodes by default by the installer, meaning that new pods could not be placed on the hosts. Starting with OpenShift Container Platform 3.9, however, masters are marked schedulable automatically during installation and upgrade. This change is mainly so that the web console, which used to run as part of the master itself, can instead be run as a pod deployed to the master.
Starting in OpenShift Container Platform 3.9, masters are now marked as schedulable nodes by default. As a result, the default node selector is now set by default during cluster installations and upgrades. This selector is defined in the master configuration file’s projectConfig.defaultNodeSelector field, determines which nodes projects use by default when placing pods, and was previously left blank by default. It is set to node-role.kubernetes.io/compute=true unless overridden using the osm_default_node_selector Ansible variable.
In addition, whether osm_default_node_selector is set or not, the following automatic labeling occurs for hosts defined in your inventory file during installations and upgrades:
Hosts that are neither masters nor dedicated infrastructure nodes (by default, dedicated infrastructure nodes are those with a region=infra label) are labeled with node-role.kubernetes.io/compute=true, which assigns the compute node role.
Master nodes are labeled with node-role.kubernetes.io/master=true, which assigns the master node role.
This ensures that the default node selector has available nodes to choose from when determining pod placement. See Configuring Node Host Labels for more details.
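The labeling rule and its interaction with the default node selector can be sketched as follows. This is an illustration under stated assumptions, not installer code: the function names are hypothetical, and dedicated infrastructure nodes are identified solely by the default region=infra label.

```python
COMPUTE_LABEL = "node-role.kubernetes.io/compute"
MASTER_LABEL = "node-role.kubernetes.io/master"


def automatic_role_labels(is_master: bool, labels: dict) -> dict:
    """Return the role label applied automatically to a host, if any."""
    if is_master:
        return {MASTER_LABEL: "true"}
    if labels.get("region") == "infra":
        # Dedicated infrastructure node (default convention): no compute label.
        return {}
    return {COMPUTE_LABEL: "true"}


def matches_selector(labels: dict, selector: dict) -> bool:
    """A node matches when it carries every key=value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


if __name__ == "__main__":
    default_selector = {COMPUTE_LABEL: "true"}
    hosts = {
        "master1": (True, {}),
        "infra1": (False, {"region": "infra"}),
        "node1": (False, {"region": "primary"}),
    }
    for name, (is_master, labels) in hosts.items():
        merged = {**labels, **automatic_role_labels(is_master, labels)}
        print(name, matches_selector(merged, default_selector))
```

Under these assumptions only node1 matches the default selector, so ordinary project pods land on compute nodes even though masters are schedulable.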
Starting in OpenShift Container Platform 3.9, Ansible must be installed via the rhel-7-server-ansible-2.4-rpms channel, which is included in RHEL subscriptions.
OpenShift Container Platform 3.9 deprecates the following oc secrets subcommands in favor of oc create secret:
new
new-basicauth
new-dockercfg
new-sshauth
Default values for template_service_broker_prefix and template_service_broker_image_name in the installer have been updated to be consistent with other settings.
Previous values are:
template_service_broker_prefix="registry.example.com/openshift3/"
template_service_broker_image_name="ose-template-service-broker"
New values are:
template_service_broker_prefix="registry.example.com/openshift3/ose-"
template_service_broker_image_name="template-service-broker"
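Assuming the installer builds the image reference by joining the prefix and the image name with no separator (an assumption of this sketch; the function name is hypothetical), the old and new defaults resolve to the same image, and only the split between the two variables moves:

```python
def broker_image(prefix: str, image_name: str) -> str:
    """Join prefix and image name the way this sketch assumes the installer does."""
    return prefix + image_name


old_image = broker_image("registry.example.com/openshift3/", "ose-template-service-broker")
new_image = broker_image("registry.example.com/openshift3/ose-", "template-service-broker")

if __name__ == "__main__":
    print(old_image)
    print(new_image)
    print(old_image == new_image)
```

If you override only one of the two variables in your inventory, check that the combined value still points at the image you intend.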
In an effort to provide greater flexibility for users, several instances of become: no on certain tasks and playbooks inside of openshift-ansible are now removed. These statements were primarily applied on local_action and delegate_to: localhost commands for creating temporary files on the host running Ansible.
If you run Ansible from a host that does not allow password-less sudo, some of these commands may fail if you run ansible-playbook with the -b (become) command line switch, or if ansible_become=True is applied to the local host in the inventory or group_vars.
Elevated permissions are not required on the local host when running openshift-ansible plays.
If target hosts (where OpenShift Container Platform is being deployed) require the use of become, it is recommended that you add ansible_become=True for those hosts or groups in inventory or group_vars/host_vars.
If you are running as root on the local host, or connecting as the root user on the remote hosts instead of using become, then you should not notice a change.
Unqualified image specifications now default to docker.io and require API server configuration to resolve to different registries.
The batch/v2alpha1 ScheduledJob objects are no longer supported. Use CronJobs instead.
The autoscaling/v2alpha1 API group has been removed.
For new installations of OpenShift Container Platform 3.9, disabling swap is strongly recommended. In OpenShift Container Platform 3.8, starting the OpenShift Container Platform node requires swap to be disabled. This is already done as part of the Ansible node installation.
The oadm command is now deprecated. Use oc adm instead.
The core workloads API, which is composed of the DaemonSet, Deployment, ReplicaSet, and StatefulSet kinds, has been promoted to GA stability in the apps/v1 group version. As such, the apps/v1beta2 group version is deprecated, and all new code should use the kinds in the apps/v1 group version. For OpenShift Container Platform, this means that statefulsets, daemonsets, and deployments are now stable and supported.
In OpenShift Container Platform 3.9, the Administrator Solutions guide is removed from the OpenShift Container Platform documentation. See the Day Two Operations Guide instead.
This release fixes bugs for the following components:
Builds
Previously, builds selected the secret to be used for pushing the output image at the time they were started. When a build started before the default service account secrets for a project were created, the build may not have found a suitable secret for pushing the image, resulting in the build failing when it went to push the image. With this fix, the build is held until the default service account secrets exist, ensuring that if the default secret is suitable for pushing the image, it can and will be used. As a result, initial builds in a newly created project are no longer at risk of failing if the build is created before the default secrets are populated. (BZ#1333030)
Command Line Interface
The systemd units for masters changed without the diagnostics being updated. This caused the diagnostics to silently check for master systemd units that did not exist, and problems were not reported. With this fix, diagnostics check for correct master unit names and problems with master systemd units and logs may be found. (BZ#1378883)
Containers
If a container shares a namespace with another container, the two share the namespace path. When you run the exec command in the first container, it only reads the namespace paths stored in the file and joins those namespaces. So, if the second container has already been stopped, the exec command in the first container fails. With this fix, namespace paths are saved regardless of whether containers share namespaces. (BZ#1510573)
Images
Docker has a known "zombie process" phenomenon that impacted the OpenShift Jenkins image, causing operating system-level resources to be exhausted as these "zombie processes" accumulated. With this fix, the OpenShift Jenkins image now leverages one of the Docker image init implementations to launch Jenkins, monitor, and handle any "zombie child processes". As a result, "zombie processes" no longer accumulate. (BZ#1528548)
Due to a fault in the scheduler implementation, the ScheduledImageImportMinimumIntervalSeconds setting was not correctly observed, causing OpenShift Container Platform to attempt to import scheduled images at the wrong intervals. This is now resolved. (BZ#1543446)
Previously, if any tag on an image stream was marked as scheduled, OpenShift would erroneously re-import all tags on that image stream, regardless of whether they were marked as scheduled. This behavior is now resolved. (BZ#1515060)
Image Registry
The signature importer tried to import signatures from the internal registry without credentials, causing the registry to check if the anonymous user could get signatures using SAR requests. With this bug fix, the signature importer skips the internal registry because the internal registry and the signature importer work with the same storage, resulting in no SAR requests. (BZ#1543122)
There was no check of the number of components in the path, causing the data to be placed in the storage but not be written to the database. With this bug fix, an early check of the path was added. (BZ#1528613)
Installer
The Kubernetes service IP address was not added to the no_proxy list for the docker-registry during installation. As a result, internal registry requests would be forced to use the proxy, preventing logins and pushes to the internal registry. The installer was changed to add the Kubernetes service IP to the no_proxy list. (BZ#1504464)
The installer was pulling the incorrect efs-provisioner image, which caused the installation of the provisioner pod to fail to deploy. The installer was changed to pull the correct image. (BZ#1523534)
When installing OpenShift Container Platform with a custom registry, the installer was still using the default registry for the registry console. The registry console default image is now defined as a fully qualified image, registry.access.redhat.com/openshift3/registry-console, which means that when a custom registry is specified via oreg_url and image streams are modified to use that custom registry, the registry console also uses the custom registry. (BZ#1523638)
Running the redeploy-etcd-ca.yml playbook did not update the ca.crt used by the etcd system container. The code was changed so that the playbook properly updates the etcd ca.crt in /etc/etcd/ca.crt as expected. (BZ#1466216)
Following a successful deployment of CNS/CRS with glusterblock, OpenShift Container Platform logging and metrics can be deployed using glusterblock as their backend storage for fault-tolerant, distributed persistent storage. (BZ#1480835)
When upgrading from 3.6 to 3.7, the Hawkular OpenShift Agent (HOSA) pods were still deployed after the upgrade even when the user wanted them deactivated. A new playbook, uninstall_hosa.yaml, has been created to remove HOSA from an OpenShift Container Platform cluster when openshift_metrics_install_hawkular_agent=false is set in the Ansible inventory file. (BZ#1497408)
Because registry credentials for the broker were stored in a ConfigMap, sensitive credentials could be exposed in plain text. A secret is now created to store the credentials, so registry credentials are no longer visible in plain text. (BZ#1509082)
Because of incorrect naming, the uninstall playbook did not remove the tuned-profiles-atomic-openshift-node package. The playbook is now corrected and the package is removed upon uninstallation of OpenShift Container Platform. (BZ#1509129)
When running the installer with the openshift_hosted_registry_storage_volume_size parameter configured with Jinja code, the installation failed during persistent volume creation. The code is now fixed to properly interpret the Jinja code. (BZ#1518386)
During disconnected installations, the service catalog was attempting to pull down images from the configured registry. This caused the installation to fail as the registry is not available during a disconnected installation. The imagePullPolicy in the installer was changed to IfNotPresent. If the image is present, the service catalog will not attempt to pull it again, and the disconnected installation of the service catalog will proceed. (BZ#1524805)
When provisioning hosts with an SSH proxy configured, the masters would never appear marked as up. With this bug fix, the task is changed to use an Ansible module that respects SSH proxy configuration. As a result, Ansible is able to connect to the hosts and they are marked as up. (BZ#1541946)
In an HTTPS environment, the service catalog installation was failing because the playbook attempted to contact the API server using cURL without the --noproxy option specified. The command in the playbook was changed to include --noproxy and the installer performs as expected. (BZ#1544645)
Previously, the storage type for Elasticsearch data centers was not preserved when upgrading/rerunning. This caused the existing storage type to be overwritten. This bug fix preserves the storage type as the default (using an inventory variable if specified). (BZ#1496758)
Previously, the docker daemon was incorrectly restarted when redeploying node certificates. This caused unnecessary downtime in nodes since atomic-openshift-node was the only component loading the kubeconfig. This bug fix adds a flag to check if a new Certificate Authority (CA) is being deployed. If not, then restarting Docker is skipped. (BZ#1537726)
Previously, the docker_image_availability check did not take into account variables that override specific container images used for containerized components. This caused the check to incorrectly report failures when looking for the default images when the overridden images were actually available. As a result of this bug fix, the check should accurately report whether the necessary images are available. (BZ#1538806)
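The idea behind the corrected check can be sketched as follows: apply any per-component overrides to the default image names first, and only then compare against what is available. This is a minimal sketch, not the check's implementation; the function names, component keys, and image names are hypothetical.

```python
from typing import Dict, Iterable, Set


def required_images(defaults: Dict[str, str], overrides: Dict[str, str]) -> Set[str]:
    """Apply per-component overrides on top of the default image names."""
    effective = dict(defaults)
    # Ignore empty override values so an unset variable keeps the default.
    effective.update({name: image for name, image in overrides.items() if image})
    return set(effective.values())


def missing_images(
    defaults: Dict[str, str], overrides: Dict[str, str], available: Iterable[str]
) -> Set[str]:
    """Return the required images that are not in the available set."""
    return required_images(defaults, overrides) - set(available)
```

For example, if a component's default image is overridden and only the overriding image is available, missing_images returns an empty set rather than reporting the unused default as missing.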
When determining if a persistent volume claim (PVC) should be created for Elasticsearch, we used a legacy variable, which did not correctly evaluate if a PVC was necessary when creating a Network File System (NFS)-backed persistent volume (PV). This bug fix correctly evaluates if a PVC is necessary for the deployment configuration. (BZ#1538995)
Previously, when configuring the registry for Azure Blob storage, the realm of core.windows.net was specified by default. This bug fix allows you to change openshift_hosted_registry_storage_azure_blob_realm to the value that you want to use. (BZ#1491100)
A new playbook has been introduced that uninstalls an existing GlusterFS deployment. This playbook removes all existing resources, including pods and services. This playbook also, optionally, removes all data and configuration from the hosts that were running GlusterFS pods. (BZ#1497038)
Logging
Previously, the OpenShift Container Platform logging system did not support CRI-O. This bug fix added a parser for CRI-O formatted logs. As a result, both system and container logs can be collected. (BZ#1517605)
When redeploying logging, we previously attempted to maintain any changes that were made to the ConfigMaps post-installation. It was difficult to let users specify the contents of a ConfigMap file while still needing the ability to provide the configurations required for the different Elasticsearch, Fluentd, and Kibana (EFK) stack components. This bug fix created a patch based on changes made post-deployment and applies that patch to the files provided by the installer. (BZ#1519619)
Web Console
The Kibana page previously displayed OPENSHIFT ORIGIN in the upper left-hand corner of the OpenShift Container Platform web console. This bug fix replaces the Origin header image with the OpenShift Container Platform header image. As a result, the Kibana page now displays the desired header. (BZ#1546311)
Both the OpenShift Container Platform DeploymentConfig and Kubernetes extensions/v1beta1 Deployment resources were labeled with deployment on the web console overview, so you could not differentiate the resources. DeploymentConfig resources on the Overview page are now labeled with DeploymentConfig. (BZ#1488380)
The web console’s pod status filter did not correctly display pod init status when an error prevented the pod from initializing, including an init status of error. If a pod has an Init:Error status, the pod status now correctly displays Init Error instead of Pod Initializing. (BZ#1512473)
Previously, switching tabs in the web console page for a pipeline build configuration caused some content on the page to no longer be visible while the page reloaded. Switching tabs no longer reloads the entire page, and content is correctly displayed. (BZ#1527346)
Previously, an old version of the builder image was shown when you added a builder to a project, and that version was selected by default during builder configuration. This gave the wrong impression that your only choice was an old version of a language or framework. The version number is no longer shown in the wizard title, and the newest available version is selected by default. (BZ#1542669)
If you used some browsers, you could not consistently use the right click menu to copy and paste text from internal editors that used the ACE editor library, including the YAML, Jenkinsfile, and Dockerfile editors. This update uses a newer version of the ACE editor library, so the right click menu options work throughout the console. (BZ#1463617)
Previously, browsers would use the default behavior for the Referrer-Policy because the Referrer-Policy header was not sent by the console. Now the console correctly sends the Referrer-Policy header, which is set to strict-origin-when-cross-origin, and browsers that listen to the Referrer-Policy header follow the strict-origin-when-cross-origin policy for the web console. (BZ#1504571)
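To make the policy concrete, the following simplified sketch models what a browser sends as the referrer under strict-origin-when-cross-origin. It is an approximation for illustration: it treats only https as a secure scheme, compares origins without normalizing default ports, and the function name and URLs are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def referrer_for(source_url: str, target_url: str) -> Optional[str]:
    """Approximate the referrer sent under strict-origin-when-cross-origin."""
    src, dst = urlsplit(source_url), urlsplit(target_url)
    same_origin = (src.scheme, src.hostname, src.port) == (
        dst.scheme,
        dst.hostname,
        dst.port,
    )
    if same_origin:
        # Same origin: the full URL is sent, without the fragment.
        return urlunsplit((src.scheme, src.netloc, src.path, src.query, ""))
    if src.scheme == "https" and dst.scheme != "https":
        # Downgrade from HTTPS to a non-secure target: nothing is sent.
        return None
    # Cross-origin at the same security level: only the origin is sent.
    return f"{src.scheme}://{src.netloc}/"
```

Under this model, a link from a console page to an external HTTPS site exposes only the console's origin, not the project path or query string.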
Previously, users with read access to the project saw webhook secret values because they were stored as strings in the build. These users could use these values to trigger builds even though they had only read access to the project. Now webhook secrets are defined as secret objects in the build instead of strings. Users with read only access to the project cannot see the secret values or use them to trigger builds by using the webhook. (BZ#1504819)
Previously, adding the same persistent volume claim more than once to a deployment in the web console caused pods for that deployment to fail. The web console incorrectly created a new volume when it added the second PVC to the deployment instead of reusing the existing volume from the pod template spec. Now, the web console reuses the existing volume if the same PVC is listed more than once. This behavior lets you add the same PVC with different mount paths and subpaths as needed. (BZ#1527689)
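The reuse behavior described above can be sketched against a pod template spec represented as a plain dictionary. This is an illustrative sketch, not the web console's code; the helper name and the generated volume name are hypothetical, while the field names (volumes, persistentVolumeClaim, claimName, volumeMounts, mountPath, subPath) follow the Kubernetes pod spec.

```python
from typing import Optional


def add_pvc_mount(
    pod_spec: dict,
    container_name: str,
    claim_name: str,
    mount_path: str,
    sub_path: Optional[str] = None,
) -> dict:
    """Mount a PVC into a container, reusing an existing volume for the claim."""
    volumes = pod_spec.setdefault("volumes", [])
    # Look for a volume that already references this claim.
    volume = next(
        (
            v
            for v in volumes
            if v.get("persistentVolumeClaim", {}).get("claimName") == claim_name
        ),
        None,
    )
    if volume is None:
        # Hypothetical naming scheme for a newly created volume.
        volume = {
            "name": f"{claim_name}-volume",
            "persistentVolumeClaim": {"claimName": claim_name},
        }
        volumes.append(volume)
    mount = {"name": volume["name"], "mountPath": mount_path}
    if sub_path:
        mount["subPath"] = sub_path
    for container in pod_spec.get("containers", []):
        if container["name"] == container_name:
            container.setdefault("volumeMounts", []).append(mount)
            return pod_spec
    raise KeyError(f"container {container_name!r} not found")
```

Calling the helper twice with the same claim and different mount paths yields one volume and two volume mounts.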
Previously, it was not clear enough that you cannot select an Image Name from the Deploy Image window if you are also creating a new project. The help text that explains that you can only set an Image Name for existing projects is now easier to find. (BZ#1535917)
Previously, the secrets page in the web console did not display labels. You can now view the labels for a secret like other resources. (BZ#1545828)
Sometimes the web console displayed a process template page even if you did not have permissions to process templates. If you tried to process the template, an error displayed. Now you can no longer view process templates if you cannot process them. (BZ#1510786)
Previously, the Clear Changes button did not correctly clear edits to the Environment From variables in the web console environment variable editor. The button now correctly resets edits to Environment From variables. (BZ#1515527)
By default, dialogs in the web console can be dismissed by clicking in the negative space surrounding the dialog. As a result, the warning dialog could be inadvertently dismissed. With this bug fix, the warning dialog’s configuration was changed so that it can only be dismissed by clicking one of the buttons in the dialog. (BZ#1525819)
Master
Due to a fault in the scheduler implementation, the ScheduledImageImportMinimumIntervalSeconds setting was not correctly observed, causing OpenShift Container Platform to attempt to import scheduled images at the wrong intervals. With this bug fix, the issue is now resolved. (BZ#1515058)
Networking
The OpenShift Container Platform node was not waiting long enough for the VNID. The master assigns the VNID, and it could take a while to propagate to the node. As a result, pod creation failed. This bug fix increases the timeout for fetching the VNID on the node from 1 to 5 seconds, which allows pod creation to succeed. (BZ#1509799)
It is now possible to specify a subnet length as part of the EGRESS_SOURCE variable passed to an egress router (for example, 192.168.1.100/24 rather than 192.168.1.100). In some network configurations (such as if the gateway address was a virtual IP that might be backed by one of several physical IPs at different times), ARP traffic between the egress router and its gateway might not function correctly if the egress router is not able to send traffic to other hosts on its local subnet. By specifying EGRESS_SOURCE with a subnet length, the egress router setup script will configure the egress pod in a way that will work with these network setups. (BZ#1527602)
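The difference between the two forms of EGRESS_SOURCE can be seen by parsing them with Python's standard ipaddress module. This sketch only illustrates what the subnet length conveys; it does not reproduce the egress router setup script, and the function name is hypothetical.

```python
import ipaddress


def parse_egress_source(value: str):
    """Split an EGRESS_SOURCE-style value into its address and local network."""
    iface = ipaddress.ip_interface(value)
    return iface.ip, iface.network


if __name__ == "__main__":
    for value in ("192.168.1.100/24", "192.168.1.100"):
        ip, network = parse_egress_source(value)
        print(value, "->", ip, "on", network)
```

With a subnet length, the address is known to sit on 192.168.1.0/24, so other hosts on that subnet are identifiable as local. Without it, the value parses as a single-host /32 network.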
In some circumstances, iptables rules could become reordered in a way that would cause the per-project static IP address feature to stop working for some IP addresses. (For most users, egress IP addresses that ended with an even number would continue to work, but egress IP addresses ending with an odd number would fail.) Therefore, external traffic from pods in a project that was supposed to use a per-project static IP address would end up using the normal node IP address instead. The iptables rules are changed so that they now have the expected effect even when they get reordered. With this bug fix, the per-project static egress IP feature now works reliably. (BZ#1527642)
Previously, the egress IP initialization code was only run when doing a full SDN setup, and not when OpenShift services were restarted and found an existing running SDN. This resulted in failure to create new per-project static egress IPs (HostSubnet.EgressIPs). This issue is now fixed and per-project static egress IPs work correctly after a node restart. (BZ#1533153)
Previously, OpenShift was setting colliding host-subnet values, which resulted in the pod IP network becoming unavailable across the nodes. This was because the stale OVS rules were not cleared during node startup. This is now fixed and the stale OVS rules are cleared on node startup. (BZ#1539187)
In previous versions, if a static IP address was removed from a project and then added back to the same project, it did not work correctly. This is now fixed, and removing and re-adding static egress IPs works. (BZ#1547899)
Previously, when OpenShift was deployed on OpenStack, a few required iptables rules were not created automatically, which resulted in errors in pod-to-pod communication between pods on different nodes. The Ansible OpenShift installer now sets the required iptables rules automatically. (BZ#1493955)
There was a race condition in the startup code that relied on the node setup setting a field that the userspace proxy needed. When the network plugin was not used (or if it was fast), the userspace proxy setup ran sooner and resulted in reading a nil value for the IP address of the node. Later, when the proxy (or the unidler, which uses it) was enabled, it would crash because of the nil IP address value. This issue is now fixed. A retry loop is added that waits for the IP address value to be set, and the userspace proxy and unidler work as expected. (BZ#1519991)
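The general shape of such a retry loop can be sketched as follows. This is a generic polling helper under assumed parameters, not the node's actual startup code: the function name, the timeout, and the polling interval are all illustrative.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for_value(
    getter: Callable[[], Optional[T]], timeout: float = 5.0, interval: float = 0.1
) -> T:
    """Poll getter() until it returns a non-None value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        value = getter()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("value was not set before the timeout")
        time.sleep(interval)
```

A consumer that calls wait_for_value before reading the node IP address blocks until the setup code has populated it, instead of proceeding with an unset value.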
In some circumstances, nodes were receiving a duplicate out-of-order HostSubnet deleted event from the master. During processing of this duplicate event, the node ended up deleting OVS flows corresponding to an active node, disrupting communications between these two nodes. In the latest version, the HostSubnet event-processing now checks for and ignores duplicate events. Thus, the OVS flows are not deleted, and pods communicate normally. (BZ#1544903)
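One way to ignore duplicate or stale delete events is to track which object is currently active for each node and act on a delete only when it refers to that object. The sketch below assumes each event carries a unique identifier (uid) for the HostSubnet it refers to; that assumption, the class, and its method names are hypothetical, and the notes do not describe how the node actually detects duplicates.

```python
from typing import Dict


class SubnetTracker:
    """Track the active HostSubnet per node and filter stale delete events."""

    def __init__(self) -> None:
        # node name -> uid of the HostSubnet currently in use
        self._active: Dict[str, str] = {}

    def on_added(self, node: str, uid: str) -> None:
        self._active[node] = uid

    def on_deleted(self, node: str, uid: str) -> bool:
        """Return True if flows should be removed, False if the event is ignored."""
        if self._active.get(node) != uid:
            # Duplicate or out-of-order event for an object that is no longer
            # the active one: leave the current flows untouched.
            return False
        del self._active[node]
        return True
```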
Previously, the openshift ex dockergc command, which cleans up docker images, failed occasionally. This issue is now fixed. (BZ#1511852)
Previously, nested secrets did not get mounted in pods. This issue is now fixed. (BZ#1516569)
HAProxy versions earlier than version 1.9 dropped new connections during a reload. This issue is now fixed. By using HAProxy’s seamless reload feature, HAProxy now passes open sockets when reloading. (BZ#1464657)
There was a spurious error in system logs. The error Stat fs failed. Error: no such file or directory appeared in logs frequently. This was because of calling the syscall.Statfs function in code when the path does not exist. This issue is now fixed. (BZ#1511576)
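The class of problem can be illustrated in Python with os.statvfs, which behaves like the Go syscall.Statfs call in that it fails when the path does not exist. The sketch shows one way to treat a missing path as an expected condition rather than an error worth logging; it is not the actual fix, which is not described here, and the function name is hypothetical.

```python
import os
from typing import Optional


def free_bytes(path: str) -> Optional[int]:
    """Return the free space available at path, or None if it does not exist."""
    try:
        stats = os.statvfs(path)
    except FileNotFoundError:
        # A missing path is expected in this scenario, so report "unknown"
        # quietly instead of logging an error on every call.
        return None
    return stats.f_bavail * stats.f_frsize
```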
Previously, a rejected routes error message showed up when using router shards. This issue is now fixed and the rejected routes error messages are now suppressed in HAProxy if router shards are used. (BZ#1491717)
Previously, if you created a route with the host set to localhost, and the ROUTER_USE_PROXY_PROTOCOL environment variable was not set to true, any route reloads would fail. This is because the hostname being set to the default resulted in mismatches in route configurations. The -H option is now available when using curl, meaning the health check does not pass the hostname when set to 'localhost', and routes reload successfully. (BZ#1542612)
Previously, updating TLS certificates was not possible for cluster administrators. Because it is an expected task of the cluster administrator, the role has been changed to update TLS certificates. (BZ#1524707)
Service Broker
Previously, the APBs for MariaDB, PostgreSQL, and MySQL were tagged as "databases" instead of "database". The tag is now corrected to "database", matching other services, and the APBs are now properly shown in search results. (BZ#1510804)
Async bind and unbind is an experimental feature for the OpenShift Ansible broker (OAB) and is not supported or enabled by default. Red Hat’s officially released APBs (PostgreSQL, MariaDB, MySQL, and Mediawiki) also do not support async bind and unbind. (BZ#1548997)
Previously, the etcd server was not accessible when using the etcdctl command. This was caused by the tcp being set to “0.0.0.0” instead of the expected --advertise-client-urls value of the asb-etcd deployment configuration. The command has been updated and the etcd server is now accessible. (BZ#1514417)
Previously, the apb push -o command failed when using it outside the cluster. This was because the Docker registry service of the desired service was set to hit only the route used by internal operations. The appropriate Ansible playbook has been updated to point to the appropriate route instead. (BZ#1519193)
Previously, when typing asbd --help or asbd -h, the --help argument returned a code that was being misinterpreted as an error, resulting in errors printing out twice. The fix corrects errors to only print once and also to interpret the help command return code as valid. As a result, the help command now only prints once. (BZ#1525817)
Previously, setting the white-list variable in an RHCC registry would maintain searching for any options, even after those options were removed from the configuration. This was caused by an error in the white-list code. The error has been fixed by this bug fix. (BZ#1526887)
Previously, if the registry configuration did not have auth_type set to config, error messages would appear. This bug fix ensures that registry configurations work correctly without the auth_type setting. (BZ#1526949)
Previously, the broker would return a 400 status code instead of a 403 status code when the user did not have the permissions to execute a task. This bug fix corrects the error, and the correct status code is now returned. (BZ#1510486)
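The distinction between the two status codes can be sketched with a small error-to-status mapping. The exception classes and the function are hypothetical and do not reflect the broker's code; only the HTTP semantics (400 for a malformed request, 403 for a request the caller is not permitted to make) are standard.

```python
from http import HTTPStatus


class BadRequest(Exception):
    """The request itself is malformed or invalid (hypothetical error type)."""


class PermissionDenied(Exception):
    """The caller is not allowed to perform the action (hypothetical error type)."""


def status_for(error: Exception) -> HTTPStatus:
    """Map an error to the HTTP status code a client should receive."""
    if isinstance(error, PermissionDenied):
        return HTTPStatus.FORBIDDEN  # 403
    if isinstance(error, BadRequest):
        return HTTPStatus.BAD_REQUEST  # 400
    return HTTPStatus.INTERNAL_SERVER_ERROR  # 500
```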
Previously, any MariaDB configuration options were displayed with MySQL options. This is because MariaDB uses MySQL variables upstream. This bug fix ensures that, in terms of OpenShift, the variables are called out as MariaDB. (BZ#1510294)
Storage
Previously, OpenShift checked mounted NFS volumes by accessing them, which failed for volumes with the root squash option. OpenShift permissions while running as root were squashed to the 'nobody' user, who did not have permissions to access the mounted NFS volume. This caused the OpenShift checks to fail, and the NFS volumes were not unmounted. Now, OpenShift does not access mounted NFS volumes, and checks for mounts by parsing the /proc filesystem. NFS volumes with the root squash option are now unmounted. (BZ#1518237)
Previously, when a node that had an OpenStack Cinder type of persistent volume attached was shut down or crashed, the attached volume did not detach. Because the persistent volume was unavailable, the pods did not migrate from the failed node, and the volumes were inaccessible from other nodes and pods. Now, when a node fails, all of its attached volumes are detached after a time-out. (BZ#1523142)
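Checking for a mount by parsing the /proc filesystem, rather than touching the mounted path, can be sketched as follows. This is a minimal illustration that assumes the standard /proc/mounts line layout (device, mount point, filesystem type, options, and so on, with special characters in paths written as octal escapes such as \040 for a space); it is not the OpenShift implementation, and the function name is hypothetical.

```python
import re


def is_mounted(path: str, mounts_text: str) -> bool:
    """Return True if path appears as a mount point in /proc/mounts content."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Decode octal escapes used for spaces, tabs, and backslashes.
        mount_point = re.sub(
            r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), fields[1]
        )
        if mount_point == path:
            return True
    return False
```

On a Linux host, the second argument would typically be the content read from /proc/mounts. Because the check never opens the mounted directory itself, it does not depend on the caller's permissions on the NFS export.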
Previously, downward API, secrets, ConfigMap, and projected volumes fully managed their content and did not allow any other volumes to be mounted on top of them. This meant that users could not mount any volume on top of the aforementioned volumes. With this bug fix, the volumes now touch only the files they create. As a result, users can mount any volume on top of the aforementioned volumes. (BZ#1430322)
Upgrade
The upgrade playbooks did not previously regenerate the registry certificate when upgrading from releases prior to 3.6, which lacked the name 'docker-registry.default.svc'. As such, the configuration variables were not updated to push to the registry via DNS. The 3.9 upgrade playbooks now regenerate the certificate when needed, ensuring that all environments upgraded to 3.9 now push to the registry via DNS. (BZ#1519060)
The etcd host validation now accepts one or more etcd hosts, allowing greater flexibility in the number of etcd hosts configured. The recommended number of etcd hosts is still 3. (BZ#1506177)
Some features in this release are currently in Technology Preview. These experimental features are not intended for production use. Please note the following scope of support on the Red Hat Customer Portal for these features:
In the table below, features marked TP indicate Technology Preview and features marked GA indicate General Availability.
| Feature | OCP 3.6 | OCP 3.7 | OCP 3.9 |
|---|---|---|---|
| | - | TP | TP |
| Local Storage Persistent Volumes | - | TP | TP |
| CRI-O for runtime pods | - | TP | GA* [1] |
| Tenant Driven Snapshotting | - | TP | TP |
| | - | TP | TP |
| Service Catalog | TP | GA | - |
| Template Service Broker | TP | GA | - |
| OpenShift Automation Broker | TP | GA | - |
| Network Policy | TP | GA | - |
| Service Catalog Initial Experience | TP | GA | - |
| New Add Project Flow | TP | GA | - |
| Search Catalog | TP | GA | - |
| CFME Installer | TP | GA | - |
| | TP | TP | GA |
| | TP | TP | GA |
| StatefulSets | TP | TP | GA |
| | TP | TP | GA |
| | TP | TP | GA |
| | TP | TP | Dropped |
| | TP | TP | GA |
| Hawkular Agent | TP | Dropped | |
| Pod PreSets | TP | Dropped | |
| | - | TP | TP |
| | TP | TP | TP |
| | - | TP | GA |
| | - | TP | GA |
| | - | TP | GA |
| | TP | TP | GA |
| | - | TP | TP |
| Clustered MongoDB Template | TP | Community | - |
| Clustered MySQL Template | TP | Community | - |
| | TP | TP | GA |
| | - | - | TP |
| | - | - | TP |
| | - | - | TP |
| | - | - | TP |
| | - | - | TP |
| | - | - | TP |
There is a known issue in the initial GA release of OpenShift Container Platform 3.9 that causes the installation and upgrade playbooks to consume more memory than previous releases. The node scale-up and installation Ansible playbooks may have consumed more memory on the control host (the system where you run the playbooks from) than expected due to the use of include_tasks in several places. This issue has been addressed with the release of RHBA-2018:0600; the majority of these instances have now been converted to import_tasks calls, which do not consume as much memory. After this change, memory consumption on the control host should be below 100MiB per host; for large environments (100+ hosts), a control host with at least 16GiB of memory is recommended. (BZ#1558672)
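As a rough worked estimate based on the per-host figure above, the following sketch computes an upper bound on playbook memory consumption for a given number of hosts. The function name is hypothetical, and the result is only the stated per-host bound multiplied out; it does not account for anything else running on the control host.

```python
def control_host_memory_upper_bound_gib(host_count: int, per_host_mib: float = 100) -> float:
    """Upper bound on playbook memory use, in GiB, at per_host_mib MiB per host."""
    return host_count * per_host_mib / 1024


if __name__ == "__main__":
    for hosts in (10, 100, 150):
        bound = control_host_memory_upper_bound_gib(hosts)
        print(f"{hosts} hosts: below about {bound:.1f} GiB")
```

For 100 hosts, the bound works out to just under 10 GiB, which is consistent with the recommendation of at least 16GiB for large environments.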
Security, bug fix, and enhancement updates for OpenShift Container Platform 3.9 are released as asynchronous errata through the Red Hat Network. All OpenShift Container Platform 3.9 errata are available on the Red Hat Customer Portal. See the OpenShift Container Platform Life Cycle for more information about asynchronous errata.
Red Hat Customer Portal users can enable errata notifications in the account settings for Red Hat Subscription Management (RHSM). When errata notifications are enabled, users are notified via email whenever new errata relevant to their registered systems are released.
Red Hat Customer Portal user accounts must have systems registered and consuming OpenShift Container Platform entitlements for OpenShift Container Platform errata notification emails to generate.
This section will continue to be updated over time to provide notes on enhancements and bug fixes for future asynchronous errata releases of OpenShift Container Platform 3.9. Versioned asynchronous releases, for example with the form OpenShift Container Platform 3.9.z, will be detailed in subsections. In addition, releases in which the errata text cannot fit in the space provided by the advisory will be detailed in subsections that follow.
For any OpenShift Container Platform release, always review the instructions on upgrading your cluster properly.
* indicates delivery in a z-stream patch.