Commit 66555b1d by Guangbo Chen

Basecopy hadoop v1.0.7 to v1.1.1

parent 380bd960
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*~
# Various IDEs
.project
.idea/
*.tmproj
apiVersion: v1
description: The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
name: hadoop
version: 1.0.7
appVersion: 2.7.3
home: https://hadoop.apache.org/
sources:
- https://github.com/apache/hadoop
icon: http://hadoop.apache.org/images/hadoop-logo.jpg
maintainers:
- name: danisla
email: disla@google.com
# Hadoop Chart
[Hadoop](https://hadoop.apache.org/) is a framework for running large-scale distributed applications.
This chart is primarily intended for YARN and MapReduce job execution, where HDFS is used only to transport small artifacts within the framework, not as a distributed filesystem. Data should be read from cloud-based datastores such as Google Cloud Storage, S3 or Swift.
## Chart Details
## Installing the Chart
To install the chart with the release name `hadoop` that utilizes 50% of the available node resources:
```
$ helm install --name hadoop $(stable/hadoop/tools/calc_resources.sh 50) stable/hadoop
```
> Note that you need at least 2GB of free memory per NodeManager pod; if your cluster isn't large enough, not all pods will be scheduled.

The optional [`calc_resources.sh`](./tools/calc_resources.sh) script is a convenience helper that sets `yarn.nodeManager.replicas` and `yarn.nodeManager.resources` to utilize all nodes in the Kubernetes cluster and a given percentage of their resources. For example, with a 3 node `n1-standard-4` GKE cluster and an argument of `50`, this would create 3 NodeManager pods claiming 2 cores and 7.5Gi of memory each.
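The script simply prints a `--set` argument string for `helm install`. For illustration (the exact numbers depend on what is currently allocatable in your cluster), its output has this shape:

```
$ stable/hadoop/tools/calc_resources.sh 50
--set yarn.nodeManager.replicas=3,yarn.nodeManager.resources.requests.cpu=1950m,yarn.nodeManager.resources.requests.memory=7372Mi,yarn.nodeManager.resources.limits.cpu=1950m,yarn.nodeManager.resources.limits.memory=7372Mi
```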
## Configuration
The following table lists the configurable parameters of the Hadoop chart and their default values.
| Parameter | Description | Default |
| ------------------------------------------------- | ------------------------------- | ---------------------------------------------------------------- |
| `image.repository` | Hadoop image ([source](https://github.com/Comcast/kube-yarn/tree/master/image)) | `danisla/hadoop` |
| `image.tag` | Version of hadoop libraries being used | `{VERSION}` |
| `image.pullPolicy` | Pull policy for the images | `IfNotPresent` |
| `antiAffinity`                                     | Pod antiaffinity, `hard` or `soft` | `soft` |
| `hdfs.nameNode.pdbMinAvailable` | PDB for HDFS NameNode | `1` |
| `hdfs.nameNode.resources` | resources for the HDFS NameNode | `requests:memory=256Mi,cpu=10m,limits:memory=2048Mi,cpu=1000m` |
| `hdfs.dataNode.replicas` | Number of HDFS DataNode replicas | `1` |
| `hdfs.dataNode.pdbMinAvailable` | PDB for HDFS DataNode | `1` |
| `hdfs.dataNode.resources` | resources for the HDFS DataNode | `requests:memory=256Mi,cpu=10m,limits:memory=2048Mi,cpu=1000m` |
| `yarn.resourceManager.pdbMinAvailable` | PDB for the YARN ResourceManager | `1` |
| `yarn.resourceManager.resources` | resources for the YARN ResourceManager | `requests:memory=256Mi,cpu=10m,limits:memory=2048Mi,cpu=1000m` |
| `yarn.nodeManager.pdbMinAvailable` | PDB for the YARN NodeManager | `1` |
| `yarn.nodeManager.replicas` | Number of YARN NodeManager replicas | `2` |
| `yarn.nodeManager.parallelCreate` | Create all nodeManager statefulset pods in parallel (K8S 1.7+) | `false` |
| `yarn.nodeManager.resources` | Resource limits and requests for YARN NodeManager pods | `requests:memory=2048Mi,cpu=1000m,limits:memory=2048Mi,cpu=1000m`|
| `persistence.nameNode.enabled` | Enable/disable persistent volume | `false` |
| `persistence.nameNode.storageClass` | Name of the StorageClass to use per your volume provider | `-` |
| `persistence.nameNode.accessMode` | Access mode for the volume | `ReadWriteOnce` |
| `persistence.nameNode.size` | Size of the volume | `50Gi` |
| `persistence.dataNode.enabled` | Enable/disable persistent volume | `false` |
| `persistence.dataNode.storageClass` | Name of the StorageClass to use per your volume provider | `-` |
| `persistence.dataNode.accessMode` | Access mode for the volume | `ReadWriteOnce` |
| `persistence.dataNode.size` | Size of the volume | `200Gi` |
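Each parameter can be set with the `--set key=value[,key=value]` argument to `helm install`. For example, to run three NodeManagers and keep DataNode data on a persistent volume (illustrative values):

```
$ helm install --name hadoop \
    --set yarn.nodeManager.replicas=3,persistence.dataNode.enabled=true \
    stable/hadoop
```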
## Related charts
The [Zeppelin Notebook](https://github.com/kubernetes/charts/tree/master/stable/zeppelin) chart can use the hadoop config for the hadoop cluster and use the YARN executor:
```
helm install --set hadoop.useConfigMap=true stable/zeppelin
```
# References
- Original K8S Hadoop adaptation this chart was derived from: https://github.com/Comcast/kube-yarn
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*~
# Various IDEs
.project
.idea/
*.tmproj
apiVersion: v1
description: Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
name: zeppelin
version: 1.0.1
appVersion: 0.7.2
home: https://zeppelin.apache.org/
sources:
- https://github.com/apache/zeppelin
icon: https://zeppelin.apache.org/assets/themes/zeppelin/img/zeppelin_classic_logo.png
maintainers:
- name: danisla
email: disla@google.com
# Zeppelin Chart
[Zeppelin](https://zeppelin.apache.org/) is a web-based notebook for interactive data analytics with Spark, SQL and Scala.
## Chart Details
## Configuration
The following table lists the configurable parameters of the Zeppelin chart and their default values.
| Parameter | Description | Default |
| ------------------------------------ | ----------------------------------------------------------------- | ---------------------------------------------------------- |
| `zeppelin.image` | Zeppelin image | `dylanmei/zeppelin:{VERSION}` |
| `zeppelin.resources` | Resource limits and requests | `limits.memory=4096Mi, limits.cpu=2000m` |
| `spark.driverMemory` | Memory used by [Spark driver](https://spark.apache.org/docs/latest/configuration.html#application-properties) (Java notation) | `1g` |
| `spark.executorMemory` | Memory used by [Spark executors](https://spark.apache.org/docs/latest/running-on-yarn.html) (Java notation) | `1g` |
| `spark.numExecutors` | Number of [Spark executors](https://spark.apache.org/docs/latest/running-on-yarn.html) | `2` |
| `hadoop.useConfigMap` | Use external Hadoop configuration for Spark executors | `false` |
| `hadoop.configMapName` | Name of the hadoop config map to use (must be in same namespace) | `hadoop-config` |
| `hadoop.configPath` | Path in the Zeppelin image where the Hadoop config is mounted | `/usr/hadoop-2.7.3/etc/hadoop` |
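Parameters can likewise be overridden with `--set` at install time. For example, to run four Spark executors with 2g of memory each (illustrative values):

```
$ helm install --name zeppelin \
    --set spark.numExecutors=4,spark.executorMemory=2g \
    stable/zeppelin
```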
## Related charts
The [Hadoop](https://github.com/kubernetes/charts/tree/master/stable/hadoop) chart can be used to create a YARN cluster where Spark jobs are executed:
```
helm install -n hadoop stable/hadoop
helm install --set hadoop.useConfigMap=true,hadoop.configMapName=hadoop-hadoop stable/zeppelin
```
> Note that you may also want to set `spark.numExecutors` to match the number of YARN NodeManager replicas and `spark.executorMemory` to half of the NodeManager memory limit.
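For example, with the Hadoop chart defaults (two NodeManagers limited to 2048Mi of memory each), a matching configuration would be:

```
helm install --set hadoop.useConfigMap=true,hadoop.configMapName=hadoop-hadoop,spark.numExecutors=2,spark.executorMemory=1g stable/zeppelin
```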
questions:
- variable: defaultImage
default: "true"
description: "Use default Docker image"
label: Use Default Image
type: boolean
show_subquestion_if: false
group: "Container Images"
subquestions:
- variable: zeppelin.image.repository
default: "dylanmei/zeppelin"
description: "Zeppelin image name"
type: string
label: Zeppelin Image Name
- variable: zeppelin.image.tag
default: "0.7.2"
description: "Zeppelin image tag"
type: string
label: Zeppelin Image Tag
- variable: hadoop.useConfigMap
default: true
description: "Expose Zeppelin using Layer 7 Load Balancer - ingress"
type: boolean
label: Expose Zeppelin using Layer 7 Load Balancer
show_subquestion_if: true
group: "Hadoop Configuration"
required: true
- variable: ingress.enabled
default: "true"
description: "Expose Zeppelin using Layer 7 Load Balancer - ingress"
type: boolean
label: Expose Zeppelin using Layer 7 Load Balancer
show_subquestion_if: true
group: "Zeppelin"
required: true
subquestions:
- variable: ingress.hosts[0]
default: "xip.io"
description: "Hostname to your Zeppelin installation"
type: hostname
required: true
label: Hostname
- variable: service.type
default: "ClusterIP"
description: "Zeppelin service type"
type: enum
group: "Zeppelin"
options:
- "ClusterIP"
- "NodePort"
required: true
label: Zeppelin Service Type
show_subquestion_if: "NodePort"
show_if: "ingress.enabled=false"
subquestions:
- variable: service.nodePort
default: ""
description: "NodePort http port(to set explicitly, choose port between 30000-32767)"
type: int
min: 30000
max: 32767
show_if: "ingress.enabled=false"
label: Zeppelin NodePort Number
1. Create a port-forward to the Zeppelin pod:
kubectl port-forward -n {{ .Release.Namespace }} $(kubectl get pod -n {{ .Release.Namespace }} --selector=app={{ template "zeppelin.name" . }} -o jsonpath='{.items[0].metadata.name}') 8080:8080
Open the UI in your browser:
open http://localhost:8080
{{/* vim: set filetype=mustache: */}}
{{/*
Expand the name of the chart.
*/}}
{{- define "zeppelin.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 24 | trimSuffix "-" -}}
{{- end -}}
{{/*
Create a default fully qualified app name.
We truncate at 24 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "zeppelin.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- printf "%s-%s" .Release.Name $name | trunc 24 | trimSuffix "-" -}}
{{- end -}}
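{{/*
Illustrative example only (an annotation, not rendered): for a release named
"zep" and the default chart name, "zeppelin.name" yields "zeppelin" and
"zeppelin.fullname" yields "zep-zeppelin"; anything longer than 24 characters
is truncated and a trailing "-" is trimmed.
*/}}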
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: {{ template "zeppelin.fullname" . }}
labels:
app: {{ template "zeppelin.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
spec:
strategy:
rollingUpdate:
maxUnavailable: 0
replicas: {{ .Values.zeppelin.replicas }}
template:
metadata:
labels:
app: {{ template "zeppelin.name" . }}
release: {{ .Release.Name }}
spec:
terminationGracePeriodSeconds: 0
containers:
- name: zeppelin
image: {{ .Values.zeppelin.image.repository }}:{{ .Values.zeppelin.image.tag }}
ports:
- containerPort: 8080
name: web
env:
- name: ZEPPELIN_PORT
value: "8080"
- name: ZEPPELIN_JAVA_OPTS
value: >-
-Dspark.driver.memory={{ .Values.spark.driverMemory }}
-Dspark.executor.memory={{ .Values.spark.executorMemory }}
{{- if .Values.hadoop.useConfigMap }}
- name: MASTER
value: "yarn"
- name: SPARK_SUBMIT_OPTIONS
value: >-
--deploy-mode client
--num-executors {{ .Values.spark.numExecutors }}
{{- end }}
volumeMounts:
{{- if .Values.hadoop.useConfigMap }}
- mountPath: {{ .Values.hadoop.configPath }}
name: hadoop-config
{{- end }}
resources:
{{ toYaml .Values.zeppelin.resources | indent 12 }}
readinessProbe:
httpGet:
path: /
port: 8080
initialDelaySeconds: 20
timeoutSeconds: 1
{{- if .Values.hadoop.useConfigMap }}
volumes:
- name: hadoop-config
configMap:
{{- if .Values.hadoop.configMapName }}
name: {{ .Values.hadoop.configMapName }}
{{- else }}
name: {{ .Release.Name }}-hadoop
{{- end }}
{{- end }}
{{- if .Values.ingress.enabled -}}
{{- $serviceName := include "zeppelin.fullname" . }}
{{- $routePrefix := .Values.routePrefix }}
{{- $releaseName := .Release.Name }}
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: {{ template "zeppelin.fullname" . }}
labels:
app: {{ template "zeppelin.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
{{- if .Values.ingress.labels }}
{{ toYaml .Values.ingress.labels | indent 4 }}
{{- end }}
{{- if .Values.ingress.annotations }}
annotations:
{{ toYaml .Values.ingress.annotations | indent 4 }}
{{- end }}
spec:
rules:
{{- range $host := .Values.ingress.hosts }}
- host: {{ $host }}
http:
paths:
- path: "{{ $routePrefix }}"
backend:
serviceName: {{ $serviceName }}
servicePort: 8080
{{- end -}}
{{- if .Values.ingress.tls }}
tls:
{{ toYaml .Values.ingress.tls | indent 4 }}
{{- end }}
{{- end }}
apiVersion: v1
kind: Service
metadata:
name: {{ template "zeppelin.fullname" . }}
labels:
app: {{ template "zeppelin.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
annotations:
{{- if .Values.service.annotations }}
{{ toYaml .Values.service.annotations | indent 4 }}
{{- end }}
spec:
type: {{ .Values.service.type }}
ports:
- port: 8080
name: web
targetPort: 8080
{{- if .Values.service.nodePort }}
nodePort: {{ .Values.service.nodePort }}
{{- end }}
selector:
app: {{ template "zeppelin.name" . }}
release: {{ .Release.Name }}
{{- if contains "LoadBalancer" .Values.service.type }}
{{- if .Values.service.loadBalancerIP }}
loadBalancerIP: {{ .Values.service.loadBalancerIP }}
{{- end -}}
{{- if .Values.service.loadBalancerSourceRanges}}
loadBalancerSourceRanges:
{{- range .Values.service.loadBalancerSourceRanges }}
- {{ . }}
{{- end }}
{{- end -}}
{{- end -}}
zeppelin:
replicas: 1
image:
repository: dylanmei/zeppelin
tag: 0.7.2
resources:
limits:
memory: "4096Mi"
cpu: "2000m"
requests:
memory: "512Mi"
cpu: "200m"
hadoop:
useConfigMap: true
# configMapName: hadoop-hadoop
configPath: /usr/hadoop-2.7.3/etc/hadoop
spark:
driverMemory: 1g
executorMemory: 1g
numExecutors: 2
ingress:
## If true, a Zeppelin Ingress will be created
##
enabled: false
## Annotations for the Zeppelin Ingress
##
annotations: {}
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
## Labels to be added to the Ingress
##
labels: {}
## Hostnames.
## Must be provided if Ingress is enabled.
##
# hosts:
# - zeppelin.example.com
hosts: []
## TLS configuration for the Zeppelin Ingress
## Secret must be manually created in the namespace
##
tls: []
# - secretName: zeppelin-general-tls
# hosts:
# - zeppelin.example.com
service:
## Annotations to be added to the Service
##
annotations: {}
## Cluster-internal IP address for the Zeppelin Service
##
clusterIP: ""
## List of external IP addresses at which the Zeppelin Service will be available
##
externalIPs: []
## External IP address to assign to the Zeppelin Service
## Only used if service.type is 'LoadBalancer' and supported by cloud provider
##
loadBalancerIP: ""
## List of client IPs allowed to access the Zeppelin Service
## Only used if service.type is 'LoadBalancer' and supported by cloud provider
##
loadBalancerSourceRanges: []
## Port to expose on each node
## Only used if service.type is 'NodePort'
##
# nodePort: 30902
## Service type
##
type: ClusterIP
categories:
- hadoop
questions:
- variable: defaultImage
default: "true"
description: "Use default Docker image"
label: Use Default Image
type: boolean
show_subquestion_if: false
group: "Container Images"
subquestions:
- variable: image.repository
default: "danisla/hadoop"
description: "Hadoop image name"
type: string
label: Hadoop Image Name
- variable: image.tag
default: "2.7.3"
description: "Hadoop image tag"
type: string
label: Hadoop Image Tag
- variable: zeppelin.zeppelin.image.repository
default: "dylanmei/zeppelin"
description: "Zeppelin image name"
type: string
label: Zeppelin Image Name
- variable: zeppelin.zeppelin.image.tag
default: "0.7.2"
description: "Zeppelin image tag"
type: string
label: Zeppelin Image Tag
- variable: yarn.nodeManager.replicas
default: 2
description: "The number of YARN NodeManager instances"
type: int
min: 2
max: 100
group: "YARN Node Manager"
- variable: yarn.nodeManager.resources.requests.cpu
default: "1000m"
description: "The CPU resources allocated to each node manager pod, 1cpu=1000m"
label: "The CPU resources allocated to each node manager pod"
group: "YARN Node Manager"
type: string
required: true
- variable: yarn.nodeManager.resources.requests.memory
default: "2048Mi"
description: "The memory resources allocated to each node manager pod, 1Gi=1024Mi"
label: "The memory resources allocated to each node manager pod"
type: string
group: "YARN Node Manager"
required: true
- variable: persistence.nameNode.enabled
default: "false"
description: "Enable persistent volume for name node"
type: boolean
required: true
label: NameNode Persistent Volume Enabled
show_subquestion_if: true
group: "YARN NameNode"
subquestions:
- variable: persistence.nameNode.size
default: "50Gi"
description: "NameNode Persistent Volume Size"
type: string
label: NameNode Volume Size
required: true
- variable: persistence.nameNode.storageClass
default: ""
description: "If undefined or set to null, using the default storageClass. Defaults to null."
type: storageclass
label: Storage Class for NameNode
- variable: persistence.dataNode.enabled
default: "false"
description: "Enable persistent volume for DataNode"
type: boolean
required: true
label: DataNode Persistent Volume Enabled
show_subquestion_if: true
group: "YARN DataNode"
subquestions:
- variable: persistence.dataNode.size
default: "50Gi"
description: "DataNode Persistent Volume Size"
type: string
label: DataNode Volume Size
required: true
- variable: persistence.dataNode.storageClass
default: ""
description: "If undefined or set to null, using the default storageClass. Defaults to null."
type: storageclass
label: Storage Class for DataNode
- variable: yarn.ingress.enabled
default: "true"
description: "Expose YARN UI using Layer 7 Load Balancer - ingress"
type: boolean
label: Expose YARN UI using Layer 7 Load Balancer
show_subquestion_if: true
group: "YARN UI"
required: true
subquestions:
- variable: yarn.ingress.hosts[0]
default: "xip.io"
description: "Hostname to your YARN UI installation"
type: hostname
required: true
label: Hostname
- variable: yarn.service.type
default: "NodePort"
description: "yarn ui service type"
type: enum
group: "YARN UI"
options:
- "ClusterIP"
- "NodePort"
required: true
label: YARN UI Service Type
show_subquestion_if: "NodePort"
show_if: "yarn.ingress.enabled=false"
subquestions:
- variable: yarn.service.nodePort
default: ""
description: "NodePort http port(to set explicitly, choose port between 30000-32767)"
type: int
min: 30000
max: 32767
show_if: "yarn.ingress.enabled=false"
label: YARN UI NodePort Number
# zeppelin configurations
- variable: zeppelin.enabled
default: true
description: "Enabled zeppelin chart"
type: boolean
label: Enabled Zeppelin Chart
group: "Zeppelin"
- variable: zeppelin.ingress.enabled
default: "true"
description: "Expose Zeppelin using Layer 7 Load Balancer - ingress"
type: boolean
label: Expose Zeppelin using Layer 7 Load Balancer
show_subquestion_if: true
group: "Zeppelin"
required: true
show_if: "zeppelin.enabled=true"
subquestions:
- variable: zeppelin.ingress.hosts[0]
default: "xip.io"
description: "Hostname to your Zeppelin installation"
type: hostname
required: true
label: Hostname
show_if: "zeppelin.enabled=true"
- variable: zeppelin.service.type
default: "NodePort"
description: "yarn ui service type"
type: enum
group: "Zeppelin"
options:
- "ClusterIP"
- "NodePort"
required: true
label: Zeppelin Service Type
show_subquestion_if: "NodePort"
show_if: "ingress.enabled=false&&zeppelin.enabled=true"
subquestions:
- variable: zeppelin.service.nodePort
default: ""
description: "NodePort http port(to set explicitly, choose port between 30000-32767)"
type: int
min: 30000
max: 32767
show_if: "ingress.enabled=false"
label: Zeppelin NodePort Number
dependencies:
- name: zeppelin
version: "1.0.1"
condition: zeppelin.enabled
1. You can check the status of HDFS by running this command:
kubectl exec -n {{ .Release.Namespace }} -it {{ template "hadoop.fullname" . }}-hdfs-nn-0 -- /usr/local/hadoop/bin/hdfs dfsadmin -report
2. You can list the yarn nodes by running this command:
kubectl exec -n {{ .Release.Namespace }} -it {{ template "hadoop.fullname" . }}-yarn-rm-0 -- /usr/local/hadoop/bin/yarn node -list
3. Create a port-forward to the yarn resource manager UI:
kubectl port-forward -n {{ .Release.Namespace }} {{ template "hadoop.fullname" . }}-yarn-rm-0 8088:8088
Then open the UI in your browser:
open http://localhost:8088
4. You can run included hadoop tests like this:
kubectl exec -n {{ .Release.Namespace }} -it {{ template "hadoop.fullname" . }}-yarn-nm-0 -- /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-{{ .Values.image.tag }}-tests.jar TestDFSIO -write -nrFiles 5 -fileSize 128MB -resFile /tmp/TestDFSIOwrite.txt
5. You can list the mapreduce jobs like this:
kubectl exec -n {{ .Release.Namespace }} -it {{ template "hadoop.fullname" . }}-yarn-rm-0 -- /usr/local/hadoop/bin/mapred job -list
6. This chart can also be used with the Zeppelin chart:
helm install --namespace {{ .Release.Namespace }} --set hadoop.useConfigMap=true,hadoop.configMapName={{ template "hadoop.fullname" . }} stable/zeppelin
7. You can scale the number of yarn nodes like this:
helm upgrade {{ .Release.Name }} --set yarn.nodeManager.replicas=4 stable/hadoop
Make sure to update the values.yaml if you want to make this permanent.
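For example, to make the scale-up above permanent, the corresponding values.yaml entry would be:
   yarn:
     nodeManager:
       replicas: 4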
{{/* vim: set filetype=mustache: */}}
{{/*
Expand the name of the chart.
*/}}
{{- define "hadoop.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "hadoop.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: {{ template "hadoop.fullname" . }}-hdfs-dn
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: hdfs-dn
spec:
selector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: hdfs-dn
minAvailable: {{ .Values.hdfs.dataNode.pdbMinAvailable }}
{{- if .Values.persistence.dataNode.enabled -}}
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: {{ template "hadoop.fullname" . }}-hdfs-dn
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: hdfs-dn
spec:
accessModes:
- {{ .Values.persistence.dataNode.accessMode | quote }}
resources:
requests:
storage: {{ .Values.persistence.dataNode.size | quote }}
{{- if .Values.persistence.dataNode.storageClass }}
{{- if (eq "-" .Values.persistence.dataNode.storageClass) }}
storageClassName: ""
{{- else }}
storageClassName: "{{ .Values.persistence.dataNode.storageClass }}"
{{- end }}
{{- end }}
{{- end -}}
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: {{ template "hadoop.fullname" . }}-hdfs-dn
annotations:
checksum/config: {{ include (print $.Template.BasePath "/hadoop-configmap.yaml") . | sha256sum }}
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: hdfs-dn
spec:
serviceName: {{ template "hadoop.fullname" . }}-hdfs-dn
replicas: {{ .Values.hdfs.dataNode.replicas }}
template:
metadata:
labels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: hdfs-dn
spec:
affinity:
podAntiAffinity:
{{- if eq .Values.antiAffinity "hard" }}
requiredDuringSchedulingIgnoredDuringExecution:
- topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name | quote }}
component: hdfs-dn
{{- else if eq .Values.antiAffinity "soft" }}
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 5
podAffinityTerm:
topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name | quote }}
component: hdfs-dn
{{- end }}
terminationGracePeriodSeconds: 0
containers:
- name: hdfs-dn
image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
command:
- "/bin/bash"
- "/tmp/hadoop-config/bootstrap.sh"
- "-d"
resources:
{{ toYaml .Values.hdfs.dataNode.resources | indent 10 }}
readinessProbe:
httpGet:
path: /
port: 50075
initialDelaySeconds: 5
timeoutSeconds: 2
livenessProbe:
httpGet:
path: /
port: 50075
initialDelaySeconds: 10
timeoutSeconds: 2
volumeMounts:
- name: hadoop-config
mountPath: /tmp/hadoop-config
- name: dfs
mountPath: /root/hdfs/datanode
volumes:
- name: hadoop-config
configMap:
name: {{ template "hadoop.fullname" . }}
- name: dfs
{{- if .Values.persistence.dataNode.enabled }}
persistentVolumeClaim:
claimName: {{ template "hadoop.fullname" . }}-hdfs-dn
{{- else }}
emptyDir: {}
{{- end }}
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
name: {{ template "hadoop.fullname" . }}-hdfs-dn
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: hdfs-dn
spec:
ports:
- name: dfs
port: 9000
protocol: TCP
- name: webhdfs
port: 50075
clusterIP: None
selector:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: hdfs-dn
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: {{ template "hadoop.fullname" . }}-hdfs-nn
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: hdfs-nn
spec:
selector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: hdfs-nn
minAvailable: {{ .Values.hdfs.nameNode.pdbMinAvailable }}
{{- if .Values.persistence.nameNode.enabled -}}
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: {{ template "hadoop.fullname" . }}-hdfs-nn
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: hdfs-nn
spec:
accessModes:
- {{ .Values.persistence.nameNode.accessMode | quote }}
resources:
requests:
storage: {{ .Values.persistence.nameNode.size | quote }}
{{- if .Values.persistence.nameNode.storageClass }}
{{- if (eq "-" .Values.persistence.nameNode.storageClass) }}
storageClassName: ""
{{- else }}
storageClassName: "{{ .Values.persistence.nameNode.storageClass }}"
{{- end }}
{{- end }}
{{- end -}}
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: {{ template "hadoop.fullname" . }}-hdfs-nn
annotations:
checksum/config: {{ include (print $.Template.BasePath "/hadoop-configmap.yaml") . | sha256sum }}
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: hdfs-nn
spec:
serviceName: {{ template "hadoop.fullname" . }}-hdfs-nn
replicas: 1
template:
metadata:
labels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: hdfs-nn
spec:
affinity:
podAntiAffinity:
{{- if eq .Values.antiAffinity "hard" }}
requiredDuringSchedulingIgnoredDuringExecution:
- topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name | quote }}
component: hdfs-nn
{{- else if eq .Values.antiAffinity "soft" }}
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 5
podAffinityTerm:
topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name | quote }}
component: hdfs-nn
{{- end }}
terminationGracePeriodSeconds: 0
containers:
- name: hdfs-nn
image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
command:
- "/bin/bash"
- "/tmp/hadoop-config/bootstrap.sh"
- "-d"
resources:
{{ toYaml .Values.hdfs.nameNode.resources | indent 10 }}
readinessProbe:
httpGet:
path: /
port: 50070
initialDelaySeconds: 5
timeoutSeconds: 2
livenessProbe:
httpGet:
path: /
port: 50070
initialDelaySeconds: 10
timeoutSeconds: 2
volumeMounts:
- name: hadoop-config
mountPath: /tmp/hadoop-config
- name: dfs
mountPath: /root/hdfs/namenode
volumes:
- name: hadoop-config
configMap:
name: {{ template "hadoop.fullname" . }}
- name: dfs
{{- if .Values.persistence.nameNode.enabled }}
persistentVolumeClaim:
claimName: {{ template "hadoop.fullname" . }}-hdfs-nn
{{- else }}
emptyDir: {}
{{- end }}
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
name: {{ template "hadoop.fullname" . }}-hdfs-nn
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: hdfs-nn
spec:
ports:
- name: dfs
port: 9000
protocol: TCP
- name: webhdfs
port: 50070
clusterIP: None
selector:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: hdfs-nn
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: {{ template "hadoop.fullname" . }}-yarn-nm
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: yarn-nm
spec:
selector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: yarn-nm
minAvailable: {{ .Values.yarn.nodeManager.pdbMinAvailable }}
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: {{ template "hadoop.fullname" . }}-yarn-nm
annotations:
checksum/config: {{ include (print $.Template.BasePath "/hadoop-configmap.yaml") . | sha256sum }}
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: yarn-nm
spec:
serviceName: {{ template "hadoop.fullname" . }}-yarn-nm
replicas: {{ .Values.yarn.nodeManager.replicas }}
{{- if .Values.yarn.nodeManager.parallelCreate }}
podManagementPolicy: Parallel
{{- end }}
template:
metadata:
labels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: yarn-nm
spec:
affinity:
podAntiAffinity:
{{- if eq .Values.antiAffinity "hard" }}
requiredDuringSchedulingIgnoredDuringExecution:
- topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name | quote }}
component: yarn-nm
{{- else if eq .Values.antiAffinity "soft" }}
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 5
podAffinityTerm:
topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name | quote }}
component: yarn-nm
{{- end }}
terminationGracePeriodSeconds: {{ .Values.terminationGracePeriodSeconds }}
containers:
- name: yarn-nm
image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: 8088
name: web
command:
- "/bin/bash"
- "/tmp/hadoop-config/bootstrap.sh"
- "-d"
resources:
{{ toYaml .Values.yarn.nodeManager.resources | indent 10 }}
readinessProbe:
httpGet:
path: /node
port: 8042
initialDelaySeconds: 10
timeoutSeconds: 2
livenessProbe:
httpGet:
path: /node
port: 8042
initialDelaySeconds: 10
timeoutSeconds: 2
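# The downward API env vars below expose this container's resource limits
# to bootstrap.sh, so the NodeManager can advertise matching resources to
# YARN (divisor 1 = whole cores for CPU, 1M = megabytes for memory).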
env:
- name: MY_CPU_LIMIT
valueFrom:
resourceFieldRef:
containerName: yarn-nm
resource: limits.cpu
divisor: 1
- name: MY_MEM_LIMIT
valueFrom:
resourceFieldRef:
containerName: yarn-nm
resource: limits.memory
divisor: 1M
volumeMounts:
- name: hadoop-config
mountPath: /tmp/hadoop-config
volumes:
- name: hadoop-config
configMap:
name: {{ template "hadoop.fullname" . }}
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
name: {{ template "hadoop.fullname" . }}-yarn-nm
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: yarn-nm
spec:
ports:
- port: 8088
name: web
- port: 8082
name: web2
- port: 8042
name: api
clusterIP: None
selector:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: yarn-nm
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: {{ template "hadoop.fullname" . }}-yarn-rm
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: yarn-rm
spec:
selector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: yarn-rm
minAvailable: {{ .Values.yarn.resourceManager.pdbMinAvailable }}
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: {{ template "hadoop.fullname" . }}-yarn-rm
annotations:
checksum/config: {{ include (print $.Template.BasePath "/hadoop-configmap.yaml") . | sha256sum }}
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: yarn-rm
spec:
serviceName: {{ template "hadoop.fullname" . }}-yarn-rm
replicas: 1
template:
metadata:
labels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: yarn-rm
spec:
affinity:
podAntiAffinity:
{{- if eq .Values.antiAffinity "hard" }}
requiredDuringSchedulingIgnoredDuringExecution:
- topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name | quote }}
component: yarn-rm
{{- else if eq .Values.antiAffinity "soft" }}
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 5
podAffinityTerm:
topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name | quote }}
component: yarn-rm
{{- end }}
terminationGracePeriodSeconds: 0
containers:
- name: yarn-rm
image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: 8088
name: web
command:
- "/bin/bash"
- "/tmp/hadoop-config/bootstrap.sh"
- "-d"
resources:
{{ toYaml .Values.yarn.resourceManager.resources | indent 10 }}
readinessProbe:
httpGet:
path: /ws/v1/cluster/info
port: 8088
initialDelaySeconds: 5
timeoutSeconds: 2
livenessProbe:
httpGet:
path: /ws/v1/cluster/info
port: 8088
initialDelaySeconds: 10
timeoutSeconds: 2
volumeMounts:
- name: hadoop-config
mountPath: /tmp/hadoop-config
volumes:
- name: hadoop-config
configMap:
name: {{ template "hadoop.fullname" . }}
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
name: {{ template "hadoop.fullname" . }}-yarn-rm
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: yarn-rm
spec:
ports:
- port: 8088
name: web
clusterIP: None
selector:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: yarn-rm
{{- if .Values.yarn.ingress.enabled -}}
{{- $serviceName := include "hadoop.fullname" . }}
{{- $routePrefix := .Values.yarn.routePrefix }}
{{- $releaseName := .Release.Name }}
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
{{- if .Values.yarn.ingress.annotations }}
annotations:
{{ toYaml .Values.yarn.ingress.annotations | indent 4 }}
{{- end }}
labels:
app: {{ template "hadoop.fullname" . }}
chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
heritage: "{{ .Release.Service }}"
release: "{{ .Release.Name }}"
{{- if .Values.yarn.ingress.labels }}
{{ toYaml .Values.yarn.ingress.labels | indent 4 }}
{{- end }}
name: {{ template "hadoop.fullname" . }}
spec:
rules:
{{- range $host := .Values.yarn.ingress.hosts }}
- host: {{ $host }}
http:
paths:
- path: "{{ $routePrefix }}"
backend:
serviceName: {{ printf "%s-%s" $serviceName "yarn-ui" }}
servicePort: 8088
{{- end -}}
{{- if .Values.yarn.ingress.tls }}
tls:
{{ toYaml .Values.yarn.ingress.tls | indent 4 }}
{{- end }}
{{- end }}
# Service to access the yarn web ui
apiVersion: v1
kind: Service
metadata:
name: {{ template "hadoop.fullname" . }}-yarn-ui
labels:
app: {{ template "hadoop.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: yarn-ui
spec:
type: {{ .Values.yarn.service.type }}
ports:
- name: web
port: 8088
targetPort: web
{{- if (and (eq .Values.yarn.service.type "NodePort") (not (empty .Values.yarn.service.nodePort)))}}
nodePort: {{ .Values.yarn.service.nodePort }}
{{- end }}
selector:
app: {{ template "hadoop.name" . }}
release: {{ .Release.Name }}
component: yarn-rm
#!/bin/bash
# Calculates cluster resources given a percentage based on what is currently allocatable.
# Related issue to programmatic resource query: https://github.com/kubernetes/kubernetes/issues/27404
TARGET_PCT=$1
[[ -z "${TARGET_PCT}" ]] && echo "USAGE: $0 <target percent>" && exit 1
NODES=$(kubectl get nodes -o jsonpath='{.items..metadata.name}')
NUM_NODES=$(echo "${NODES}" | tr ' ' '\n' | wc -l | xargs echo -n)
TOTAL_CPU=$(kubectl get nodes -o jsonpath='{.items[0].status.allocatable.cpu}')
# Convert CPU to nanocores
TOTAL_CPU=$(bc <<< "${TOTAL_CPU} * 1000000000")
# Start kube proxy to get to node stats summary api
kubectl proxy >/dev/null 2>&1 &
# Record the proxy's PID ("%1" job specs are unreliable in non-interactive shells)
kproxy=$!
# Cleanup kproxy on exit
function finish {
kill $kproxy
}
trap finish EXIT
# Wait for proxy
(while [[ $count -lt 5 && -z "$(curl -s localhost:8001/api/v1)" ]]; do ((count=count+1)) ; sleep 2; done && [[ $count -lt 5 ]])
[[ $? -ne 0 ]] && echo "ERROR: could not start kube proxy to fetch node stats summary" && exit 1
declare -a NODE_STATS
declare -a AVAIL_CPU
declare -a AVAIL_MEM
i=0
for NODE in ${NODES}; do
NODE_STATS[$i]=$(curl -sf localhost:8001/api/v1/proxy/nodes/${NODE}:10255/stats/summary)
[[ $? -ne 0 ]] && echo "ERROR: Could not get stats summary for node: ${NODE}" && exit 1
# Get available memory
AVAIL_MEM[$i]=$(jq '.node.memory.availableBytes' <<< "${NODE_STATS[$i]}")
AVAIL_MEM[$i]=$(bc -l <<< "scale=0; ${AVAIL_MEM[$i]}/1024/1024")
# Derive available CPU
USED_CPU=$(jq '.node.cpu.usageNanoCores' <<< "${NODE_STATS[$i]}")
AVAIL_CPU[$i]=$(bc -l <<< "scale=2; (${TOTAL_CPU} - ${USED_CPU})/1000000")
((i=i+1))
done
# Optimize per the min resources on any node.
CORES=$(echo "${AVAIL_CPU[*]}" | tr ' ' '\n' | sort -n | head -1)
MEMORY=$(echo "${AVAIL_MEM[*]}" | tr ' ' '\n' | sort -n | head -1)
# Subtract resources used by the chart. Note these are default values.
HADOOP_SHARE_CPU=400
CORES=$(bc -l <<< "scale=0; (${CORES} - ${HADOOP_SHARE_CPU})")
HADOOP_SHARE_MEM=1024
MEMORY=$(bc -l <<< "scale=0; (${MEMORY} - ${HADOOP_SHARE_MEM})")
CPU_PER_NODE=$(bc -l <<< "scale=2; (${CORES} * ${TARGET_PCT}/100)")
MEM_PER_NODE=$(bc -l <<< "scale=2; (${MEMORY} * ${TARGET_PCT}/100)")
# Round cpu to lower mCPU
CPU_PER_NODE=$(bc -l <<< "scale=0; ${CPU_PER_NODE} - (${CPU_PER_NODE} % 10)")
# Round mem to lower Mi
MEM_PER_NODE=$(bc -l <<< "scale=0; ${MEM_PER_NODE} - (${MEM_PER_NODE} % 100)")
[[ "${CPU_PER_NODE/%.*/}" -lt 100 ]] && echo "WARN: Insufficient available CPU for scheduling" >&2
[[ "${MEM_PER_NODE/%.*/}" -lt 2048 ]] && MEM_PER_NODE=2048.0 && echo "WARN: Insufficient available Memory for scheduling" >&2
CPU_LIMIT=${CPU_PER_NODE/%.*/m}
MEM_LIMIT=${MEM_PER_NODE/%.*/Mi}
echo -n "--set yarn.nodeManager.replicas=${NUM_NODES},yarn.nodeManager.resources.requests.cpu=${CPU_LIMIT},yarn.nodeManager.resources.requests.memory=${MEM_LIMIT},yarn.nodeManager.resources.limits.cpu=${CPU_LIMIT},yarn.nodeManager.resources.limits.memory=${MEM_LIMIT}"
# The base hadoop image to use for all components.
# See this repo for image build details: https://github.com/Comcast/kube-yarn/tree/master/image
image:
repository: danisla/hadoop
tag: 2.7.3
pullPolicy: IfNotPresent
# Select antiAffinity as either hard or soft; this chart defaults to soft.
antiAffinity: "soft"
terminationGracePeriodSeconds: 30 # Duration in seconds a pod needs to terminate gracefully.
hdfs:
nameNode:
pdbMinAvailable: 1
resources:
requests:
memory: "256Mi"
cpu: "10m"
limits:
memory: "2048Mi"
cpu: "1000m"
dataNode:
replicas: 1
pdbMinAvailable: 1
resources:
requests:
memory: "256Mi"
cpu: "10m"
limits:
memory: "2048Mi"
cpu: "1000m"
yarn:
resourceManager:
pdbMinAvailable: 1
resources:
requests:
memory: "256Mi"
cpu: "10m"
limits:
memory: "2048Mi"
cpu: "2000m"
nodeManager:
pdbMinAvailable: 1
# The number of YARN NodeManager instances.
replicas: 2
# Create statefulsets in parallel (K8S 1.7+)
parallelCreate: false
# CPU and memory resources allocated to each node manager pod.
# This should be tuned to fit your workload.
resources:
requests:
memory: "2048Mi"
cpu: "1000m"
limits:
memory: "2048Mi"
cpu: "1000m"
service:
type: ClusterIP
# nodePort: 32000 # range from 30000-32767
ingress:
## If true, a YARN UI Ingress will be created
##
enabled: false
## Annotations for the YARN UI Ingress
##
annotations: {}
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
## Labels to be added to the Ingress
##
labels: {}
## Hostnames.
## Must be provided if Ingress is enabled.
##
# hosts:
# - yarn.example.com
hosts: []
## TLS configuration for the YARN UI Ingress
## Secret must be manually created in the namespace
##
tls: []
# - secretName: yarn-ui-general-tls
# hosts:
# - yarn.example.com
persistence:
nameNode:
enabled: false
storageClass: "-"
accessMode: ReadWriteOnce
size: 50Gi
dataNode:
enabled: false
storageClass: "-"
accessMode: ReadWriteOnce
size: 200Gi
zeppelin:
enabled: true
#zeppelin configurations
zeppelin:
replicas: 1
image:
repository: dylanmei/zeppelin
tag: 0.7.2
resources:
limits:
memory: "4096Mi"
cpu: "2000m"
requests:
memory: "512Mi"
cpu: "200m"
hadoop:
useConfigMap: true
# configMapName: hadoop-hadoop
configPath: /usr/hadoop-2.7.3/etc/hadoop
spark:
driverMemory: 1g
executorMemory: 1g
numExecutors: 2
ingress:
## If true, a Zeppelin Ingress will be created
##
enabled: false
## Annotations for the Zeppelin Ingress
##
annotations: {}
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
## Labels to be added to the Ingress
##
labels: {}
## Hostnames.
## Must be provided if Ingress is enabled.
##
# hosts:
# - zeppelin.example.com
hosts: []
## TLS configuration for the Zeppelin Ingress
## Secret must be manually created in the namespace
##
tls: []
# - secretName: zeppelin-general-tls
# hosts:
# - zeppelin.example.com
service:
## Annotations to be added to the Service
##
annotations: {}
## Cluster-internal IP address for the Zeppelin Service
##
clusterIP: ""
## List of external IP addresses at which the Zeppelin Service will be available
##
externalIPs: []
## External IP address to assign to the Zeppelin Service
## Only used if service.type is 'LoadBalancer' and supported by cloud provider
##
loadBalancerIP: ""
## List of client IPs allowed to access the Zeppelin Service
## Only used if service.type is 'LoadBalancer' and supported by cloud provider
##
loadBalancerSourceRanges: []
## Port to expose on each node
## Only used if service.type is 'NodePort'
##
# nodePort: 30902
## Service type
##
type: ClusterIP