
[EKS] [request]: VPC endpoint support for EKS API #298

Closed
tdmalone opened this issue May 20, 2019 · 30 comments
Labels: EKS (Amazon Elastic Kubernetes Service), Proposed (Community submitted issue)

Comments

@tdmalone commented May 20, 2019

Tell us about your request
VPC endpoint support for EKS, so that worker nodes can register with an EKS-managed cluster without requiring outbound internet access.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Worker nodes based on the EKS AMI run bootstrap.sh to connect themselves to the cluster. As part of this process, aws eks describe-cluster is called, which currently requires outbound internet access.

I'd love to be able to turn off outbound internet access but still easily bootstrap worker nodes without providing additional configuration.

Are you currently working around this issue?

  • Providing outbound internet access to worker nodes; OR
  • Supplying the cluster CA and API endpoint directly to bootstrap.sh (see the sketch below).
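
For illustration, a minimal sketch of that second workaround, assuming the EKS-optimized AMI's /etc/eks/bootstrap.sh; the cluster name, endpoint URL, and CA file below are placeholders:

    # Pass the endpoint and base64-encoded CA directly so the node never needs
    # to call aws eks describe-cluster (all values here are placeholders).
    /etc/eks/bootstrap.sh my-cluster \
      --apiserver-endpoint https://EXAMPLE1234567890.gr7.us-east-1.eks.amazonaws.com \
      --b64-cluster-ca "$(cat /etc/eks/cluster-ca.b64)"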

Additional context

tdmalone added the Proposed (Community submitted issue) label on May 20, 2019
tabern added the EKS (Amazon Elastic Kubernetes Service) label on May 21, 2019
@devonkinghorn

Is there any news on this?

@michael-burt

Any updates on this issue?

@mikestef9 (Contributor)

If you use EKS Managed Nodes, the bootstrapping process avoids the aws eks describe-cluster API call, so you can launch workers into a private subnet without outbound internet access, as long as you set up the other required PrivateLink endpoints correctly.

@michael-burt

Thanks Mike. Unfortunately managed nodes are not an option because they cannot be scaled to 0. We run some machine learning workloads that require scaling up ASGs with expensive VMs (x1.32xlarge) and we need to be able to scale them back to 0 once the workloads have completed.

@mikestef9 (Contributor)

Thanks for the feedback. Can you open a separate GH issue with that feature request for Managed Node Groups?

Will keep this issue open as it's something we are researching.

@dsw88 commented Jan 28, 2020

@mikestef9 I'm interested in the managed nodes solution. What do you mean by "you can launch workers into a private subnet without outbound internet access as long as you setup the other required PrivateLink endpoints correctly"?

Which PrivateLink endpoints are you referring to? Just the other service endpoints such as SQS and SNS that the applications running on the cluster may happen to use? Or do you mean that there are particular PrivateLink endpoints required to run EKS in private subnets with no internet gateway?

@mikestef9 (Contributor) commented Jan 28, 2020

Hi @dsw88,

In order for the worker node to join the cluster, you will need to configure VPC endpoints for ECR, EC2, and S3.

See this GH repo https://github.com/jpbarto/private-eks-cluster created by an AWS Solutions Architect for a reference implementation. Note that only 1.13 and above EKS clusters have a kubelet version that is compatible with the ECR VPC endpoint.
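
For anyone wiring this up by hand, a rough sketch of creating those endpoints with the AWS CLI; the region, VPC, subnet, security group, and route table IDs below are all placeholders:

    # Interface endpoints for ECR and EC2, with private DNS enabled so the
    # default service hostnames resolve to the endpoint ENIs inside the VPC.
    for svc in ecr.api ecr.dkr ec2; do
      aws ec2 create-vpc-endpoint \
        --vpc-id vpc-0123456789abcdef0 \
        --vpc-endpoint-type Interface \
        --service-name "com.amazonaws.us-east-1.${svc}" \
        --subnet-ids subnet-aaaa1111 subnet-bbbb2222 \
        --security-group-ids sg-cccc3333 \
        --private-dns-enabled
    done

    # Gateway endpoint for S3, associated with the private subnets' route table.
    aws ec2 create-vpc-endpoint \
      --vpc-id vpc-0123456789abcdef0 \
      --vpc-endpoint-type Gateway \
      --service-name com.amazonaws.us-east-1.s3 \
      --route-table-ids rtb-dddd4444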

@dsw88 commented Feb 3, 2020

@mikestef9 Thanks so much for the info, and thanks for the pointer to the private EKS cluster reference repository!

I have one final question that I'm having a hard time figuring out how to deal with: How can I configure other hosts in this same private VPC to be able to talk to the cluster? Knowing the private DNS name isn't a huge deal, because I can just hard-code it into whatever needs to talk to the cluster. A bigger problem, however, is how a host in the private VPC can authenticate with the cluster.

Currently when I use the AWS API to set up a kubeconfig with EKS, it includes the following snippet in the generated kubeconfig file:

- name: arn:aws:eks:REGION:ACCOUNT_ID:cluster/CLUSTER_NAME
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      - --region
      - REGION
      - eks
      - get-token
      - --cluster-name
      - CLUSTER_NAME
      command: aws
      env: null

As you can see, it calls the EKS API to get a token that authenticates it with the cluster. That definitely presents a problem, since my hosts in the private VPC also don't have access to the EKS API. Is there another way that I can authenticate to the cluster without EKS API access?
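
One possible angle, sketched here with placeholder names and under the assumption that only aws eks update-kubeconfig needs the EKS API (it calls DescribeCluster to fetch the endpoint and CA), is to write the cluster entry by hand with those two values hard-coded and keep the same exec/get-token user block:

    # Hand-build the kubeconfig instead of calling aws eks update-kubeconfig.
    # Endpoint URL, CA file, cluster name, and region below are placeholders.
    kubectl config set-cluster private-eks \
      --server=https://EXAMPLE1234567890.gr7.us-east-1.eks.amazonaws.com \
      --certificate-authority=/etc/eks/cluster-ca.pem \
      --embed-certs=true
    kubectl config set-credentials private-eks-user \
      --exec-api-version=client.authentication.k8s.io/v1alpha1 \
      --exec-command=aws \
      --exec-arg=--region --exec-arg=us-east-1 \
      --exec-arg=eks --exec-arg=get-token \
      --exec-arg=--cluster-name --exec-arg=CLUSTER_NAME
    kubectl config set-context private-eks \
      --cluster=private-eks --user=private-eks-user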

@zucler commented Feb 7, 2020

> See this GH repo https://github.com/jpbarto/private-eks-cluster created by an AWS Solutions Architect for a reference implementation. Note that only 1.13 and above EKS clusters have a kubelet version that is compatible with the ECR VPC endpoint.

It seems that this repo uses unmanaged nodes though. I tried deploying it and it brought up a cluster without any nodes listed under the EKS web console. Is this correct?

@vranystepan

@mikestef9 Thank you very much for this clue. Now I have a working setup with managed worker groups and no access to the Internet 🎉

I was not sure if it's feasible as the documentation says:

> Amazon EKS managed node groups can be launched in both public and private subnets. The only requirement is for the subnets to have outbound internet access. Amazon EKS automatically associates a public IP to the instances started as part of a managed node group to ensure that these instances can successfully join a cluster.

Well, apparently it is. If someone needs working Terraform recipes, ping me stepan@vrany.dev.

@mikestef9 (Contributor)

@vranystepan great to hear you have this working. As part of our fix for #607 we will make sure to get our documentation updated.

@duckie commented Jun 18, 2020

This is still a real issue.

I need to actually create and delete clusters from private subnets with no NAT or egress gateways. I can create private endpoints for apparently every AWS service but EKS. This is a deep pain point for some customers, as we have to build complicated workarounds to have traffic routed towards the EKS service, whereas every other AWS service is easily exposed with a private endpoint.

@evanlurvey

I agree with @duckie this issue should not be closed yet. EKS support is laughable.

@dsw88 commented Oct 27, 2020

I agree that VPC endpoints are still very important, and this issue should be kept open. It is possible to run EKS clusters in private subnets with no internet egress, but it is not possible to manage those clusters from within that private VPC. We are limited in the tooling we can develop around EKS for lifecycle actions such as creating, updating, and deleting clusters because we can't perform those actions inside our private VPC. Please consider implementing a VPC endpoint for EKS! Thanks!

@amitkarpe commented Feb 26, 2021

Hi,
Is there any workaround for this issue? We should be able to create and manage an EKS cluster in a private VPC.
In our situation (due to security policies), our bastion server (and VPC) doesn't have public access.
In that case, how can we create an EKS cluster? We are using Terraform to provision EKS.

@taro-cmd

Is there any status on this issue? This is a real problem for vendors that only use bootstrap.sh to perform automated EKS deployments, because our environments are private. I would like to know if anyone is working on this EKS private endpoint. Thanks

@torengaw commented May 7, 2021

We have the problem too. We've built a private cluster in a private VPC with CDK (the VPC is connected to a Transit Gateway). CDK uses a custom resource Lambda for doing the kubeconfig update. When the cluster endpointAccess is private (or public and private), this Lambda is associated with the VPC (via ENIs). The Lambda function calls "aws eks update-kubeconfig" from "inside" the VPC, but is unable to access the cluster endpoint and fails with a timeout. All necessary VPC endpoints (according to the official EKS docs) are set (ecr.api, ecr.dkr, s3, ...).

@xor007 commented Sep 24, 2021

+1
Making fully private clusters as custom CloudFormation resources is actually not possible without this: a Lambda in a VPC cannot get kubectl tokens.

@ctrongminh

+1
In my case, I cannot use CodeBuild attached to a VPC (all subnets are private) to reach the private EKS cluster via "aws eks update-kubeconfig".

The result is:
Connect timeout on endpoint URL: "https://eks.<region>.amazonaws.com/clusters/xxxxx"

@nhsk4u commented Dec 28, 2021

When I create a cluster with no internet access, I get the error below. Is there any update on VPC endpoint support for the EKS API?

Command used to create cluster:
aws eks create-cluster \
  --region ap-southeast-1 \
  --name CP-EKS-TEST-NHSK \
  --kubernetes-version 1.21 \
  --role-arn arn:aws:iam::4103:role/nhsk \
  --resources-vpc-config subnetIds=subnet-063b9,subnet-04,securityGroupIds=sg-03

Error Message:
connect timeout on endpoint url: "https://eks.ap-southeast-1.amazonaws.com/clusters"

@laurecs commented Jan 28, 2022

I need this as well. Is there a solution or a current workaround yet?

@djjames72

Commenting as well. An EKS VPC Endpoint would be a huge help. Have there been any updates recently?

@deitch commented Jun 24, 2022

@mikestef9

> If you use EKS Managed Nodes, the bootstrapping process avoids the aws eks describe-cluster API call, so you can launch workers into a private subnet without outbound internet access, as long as you set up the other required PrivateLink endpoints correctly.

Mike, what are the "other required endpoints"? Is there a list somewhere that says, "here are all of the endpoints that a managed node requires"?

@Xat59 commented Jun 24, 2022

> @mikestef9
>
> If you use EKS Managed Nodes, the bootstrapping process avoids the aws eks describe-cluster API call, so you can launch workers into a private subnet without outbound internet access, as long as you set up the other required PrivateLink endpoints correctly.
>
> Mike, what are the "other required endpoints"? Is there a list somewhere that says, "here are all of the endpoints that a managed node requires"?

@deitch imho the following VPC endpoints are required:

  • ecr.api, with interface mode
  • ecr.dkr, with interface mode
  • s3, with gateway mode. On this point you also need to configure a new route to reach S3 via this gateway (see the sketch below).
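
As a sketch of that last point (the endpoint and route table IDs are placeholders): associating the private subnets' route table with the S3 gateway endpoint is what adds the S3 prefix-list route.

    # Attach the private subnets' route table to an existing S3 gateway endpoint;
    # AWS then manages the S3 prefix-list route in that table automatically.
    aws ec2 modify-vpc-endpoint \
      --vpc-endpoint-id vpce-0123456789abcdef0 \
      --add-route-table-ids rtb-0123456789abcdef0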

@deitch commented Jun 24, 2022

Cool, thanks. Are the ECR endpoints only needed if you use containers from ECR, or are they a general requirement?

This should be documented formally somewhere in AWS.

@Xat59 commented Jun 24, 2022

> Cool, thanks. Are the ECR endpoints only needed if you use containers from ECR, or are they a general requirement?
>
> This should be documented formally somewhere in AWS.

When using EKS, ECR is required to bootstrap nodes. And because ECR stores images on S3 under the hood, you also have to have access to S3.
You can take a look at this EKS documentation: https://docs.aws.amazon.com/eks/latest/userguide/private-clusters.html

@deitch commented Jun 24, 2022

Much appreciated.

mikestef9 moved this from Researching to We're Working On It in containers-roadmap on Jul 29, 2022
@malikdraz

Are there any updates on this, team?

@bogdando commented Sep 15, 2022

Cluster autoscaler, when running in a private EKS cluster, also experiences that problem:

	managed_nodegroup_cache.go:133] Failed to query the managed nodegroup foo for the cluster bar while looking for labels/taints: RequestError: send request failed
	caused by: Get "https://eks.<region>.amazonaws.com/clusters/bar/node-groups/foo": dial tcp <*public_IP*>:443: i/o timeout

After reading https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html, I think there could be a workaround for that: "DHCP options set for your VPC must include AmazonProvidedDNS in its domain name servers list". But I'm not sure which domain name to configure in the DHCP options... Should it be eks.<region>.amazonaws.com?

mikestef9 moved this from We're Working On It to Coming Soon in containers-roadmap on Dec 5, 2022
@mikestef9 (Contributor) commented Dec 19, 2022

Amazon EKS now supports AWS PrivateLink for the EKS management APIs.

A few callouts:

  • VPC endpoint policies are not supported.

  • EKS support for AWS PrivateLink is available in the following AWS Regions: US East (Ohio, N. Virginia), US West (Oregon, N. California), Africa (Cape Town), Asia Pacific (Hong Kong, Mumbai, Singapore, Sydney, Seoul, Tokyo), Canada (Central), Europe (Ireland, Frankfurt, London, Stockholm, Paris, Milan), Middle East (Bahrain), South America (Sao Paulo), AWS GovCloud (US), China (Beijing), and China (Ningxia).

    • EKS API PrivateLink is not yet available in the following regions: Asia Pacific (Osaka), Asia Pacific (Jakarta), Middle East (UAE).
  • This is PrivateLink support for the EKS management APIs (CreateCluster, etc.), not the Kubernetes API endpoint of a cluster. EKS already supports a private endpoint for the Kubernetes API server, although it's implemented in a different manner from PrivateLink (and we are aware of an open feature request for the cluster private endpoint to be implemented as a standard PrivateLink endpoint).
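
For reference, the new endpoint is created like any other interface endpoint; the IDs, region, and cluster name below are placeholders:

    # Interface endpoint for the EKS management API.
    aws ec2 create-vpc-endpoint \
      --vpc-id vpc-0123456789abcdef0 \
      --vpc-endpoint-type Interface \
      --service-name com.amazonaws.us-east-1.eks \
      --subnet-ids subnet-aaaa1111 subnet-bbbb2222 \
      --security-group-ids sg-cccc3333 \
      --private-dns-enabled

    # With private DNS enabled, calls like this resolve to the endpoint's
    # private IPs instead of the public EKS API:
    aws eks describe-cluster --name my-cluster --region us-east-1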

containers-roadmap automation moved this from Coming Soon to Just Shipped Dec 19, 2022