What are the limits when crawling Twitter data through the Twitter API?

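What limits apply when crawling Twitter data through the API that Twitter provides? For example, there used to be a cap of 150 calls per user, but that cap seems to be gone now, although requests are still throttled. I'm a beginner…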

5 answers

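Twitter API rate limits are counted in 15-minute windows. There are two usage pools: 15 calls per 15 minutes, and 180 calls per 15 minutes. The Twitter search API falls into the latter, with 180 calls allowed per 15-minute window (a single call may return multiple tweets).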

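After sending a request to Twitter, you can read the rate limit information from the response headers. The information is specific to the application/user context:

  • X-Rate-Limit-Limit: the rate limit ceiling for the given request
  • X-Rate-Limit-Remaining: the number of requests left in the 15-minute window
  • X-Rate-Limit-Reset: the time at which the rate limit resets, in UTC epoch seconds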
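As a rough illustration of how these three headers can be used, here is a minimal Python sketch. It assumes X-Rate-Limit-Reset carries the reset time as UTC epoch seconds; the function name `seconds_until_reset` and the `now` parameter are made up for this example and are not part of any Twitter library.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before calling again (0.0 if calls remain).

    Hypothetical helper: `headers` is the response header mapping from
    whatever HTTP client you use.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # If the header is missing, assume calls are still available.
    remaining = int(lowered.get("x-rate-limit-remaining", "1"))
    if remaining > 0:
        return 0.0

    # Assumption: the reset header is an absolute UTC epoch timestamp.
    reset_epoch = float(lowered.get("x-rate-limit-reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_epoch - current)
```

A caller would typically `time.sleep()` for the returned number of seconds before issuing the next request to the same endpoint.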

Once requests to Twitter exceed the rate limit, Twitter returns the HTTP 429 “Too Many Requests” response code with the following message body:

{ "errors": [ { "code": 88, "message": "Rate limit exceeded" } ] }
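Detecting this condition in client code could look roughly like the sketch below. It only relies on the status code 429 and the error code 88 shown above; the function name `is_rate_limited` is an assumption made for illustration.

```python
import json

RATE_LIMIT_ERROR_CODE = 88  # error code from the message body shown above


def is_rate_limited(status_code: int, body: str) -> bool:
    """Return True if a response looks like a rate-limit rejection."""
    if status_code == 429:
        return True

    # Fall back to inspecting the body for error code 88.
    try:
        payload = json.loads(body)
    except ValueError:
        return False  # not JSON, so no structured error to inspect

    if not isinstance(payload, dict):
        return False
    errors = payload.get("errors")
    if not isinstance(errors, list):
        return False
    return any(
        isinstance(err, dict) and err.get("code") == RATE_LIMIT_ERROR_CODE
        for err in errors
    )
```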

Besides parsing the response headers, you can also get the rate limit information by sending a rate_limit_status request to Twitter.





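Reading the official documentation is the most reliable choice. For those who can't get past the firewall, here is the original text: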

API Rate Limits

Per User or Per Application

Rate limiting of the API is primarily considered on a per-user basis — or more accurately described, per access token in your control. If a method allows for 15 requests per rate limit window, then it allows you to make 15 requests per window per leveraged access token.

When using application-only authentication, rate limits are determined globally for the entire application. If a method allows for 15 requests per rate limit window, then it allows you to make 15 requests per window — on behalf of your application. This limit is considered completely separately from per-user limits.

15 Minute Windows

Rate limits are divided into 15 minute intervals. Additionally, all endpoints require authentication, so there is no concept of unauthenticated calls and rate limits.

There are two initial buckets available for GET requests: 15 calls every 15 minutes, and 180 calls every 15 minutes.


Search

Search is rate limited at 180 queries per 15 minute window.


HTTP Headers and Response Codes

Ensure that you inspect the HTTP headers, as they provide pertinent data on where your application is at for a given rate limit on the method that you just utilized.

Note that the HTTP headers are contextual. When using app-only auth, they indicate the rate limit for the application context. When using user-based auth, they indicate the rate limit for that user-application context.

  • X-Rate-Limit-Limit: the rate limit ceiling for that given request
  • X-Rate-Limit-Remaining: the number of requests left for the 15 minute window
  • X-Rate-Limit-Reset: the remaining window before the rate limit resets in UTC epoch seconds

When an application exceeds the rate limit for a given API endpoint, the Twitter API will return a HTTP 429 “Too Many Requests” response code.

If the rate limit is hit on a given endpoint, the following error will be returned:

 { "errors": [ { "code": 88, "message": "Rate limit exceeded" } ] } 

To better predict the rate limits available, consider periodically using GET application / rate_limit_status. Like the rate limiting HTTP headers, this resource’s response will indicate the rate limit status for the calling context — when using app-only auth, the limits will pertain to that auth context. When using user-based auth, the limits will pertain to the application-user context.
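To make the idea concrete, here is a small Python sketch that scans a rate_limit_status response for endpoints with no calls left. The payload shape (a `resources` object grouping endpoints by family, each with `limit`, `remaining` and `reset`) is an assumption about the response format, the numbers in `SAMPLE` are made up, and `exhausted_endpoints` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Illustrative payload only; the shape is assumed and the values are invented.
SAMPLE: Dict[str, Any] = {
    "resources": {
        "search": {
            "/search/tweets": {"limit": 180, "remaining": 42, "reset": 1403602426},
        },
        "followers": {
            "/followers/ids": {"limit": 15, "remaining": 0, "reset": 1403602426},
        },
    }
}


def exhausted_endpoints(payload: Dict[str, Any]) -> List[str]:
    """List endpoints whose remaining count has reached zero."""
    exhausted = []
    for family in payload.get("resources", {}).values():
        for endpoint, status in family.items():
            # Treat a missing "remaining" field as "calls still available".
            if status.get("remaining", 1) <= 0:
                exhausted.append(endpoint)
    return sorted(exhausted)
```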


GET and POST Request Limits

Rate limits on “reads” from the system are defined on a per user and per application basis, while rate limits on writes into the system are defined solely at the user level. In other words, for reading rate limits consider the following scenario:

  • If user A launches application Z, and app Z makes 10 calls to user A’s mention timeline in a 15 minute window, then app Z has 5 calls left to make for that window
  • Then user A launches application X, and app X calls user A’s mention timeline 3 times, then app X has 12 calls left for that window
  • The remaining value of calls on application X is isolated from application Z’s, despite the same user A

Contrast this with write allowances, which are defined on a per user basis. So if user A ends up posting 5 Tweets with application Z, then for that same period, regardless of any other application that user A opens, those 5 POSTs will count against any other application acting on behalf of user A during that same window of time.
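The read and write scenarios above can be modelled with two counters keyed differently. This is only a toy model of the accounting described in the text, not how Twitter implements it; the class name `WindowCounters` and the read limit of 15 are taken from the example scenario.

```python
from collections import defaultdict


class WindowCounters:
    """Toy model of one 15-minute window of read/write accounting."""

    def __init__(self, read_limit: int = 15) -> None:
        self.read_limit = read_limit
        self.reads = defaultdict(int)   # keyed by (user, app)
        self.writes = defaultdict(int)  # keyed by user only

    def read(self, user: str, app: str) -> int:
        """Record one read; return the calls left for this user/app pair."""
        self.reads[(user, app)] += 1
        return self.read_limit - self.reads[(user, app)]

    def write(self, user: str, app: str) -> int:
        """Record one write; return the user's total writes across all apps."""
        self.writes[user] += 1
        return self.writes[user]


window = WindowCounters()
for _ in range(10):
    left_z = window.read("A", "Z")   # app Z ends with 5 reads left
for _ in range(3):
    left_x = window.read("A", "X")   # app X ends with 12 reads left
for _ in range(5):
    window.write("A", "Z")
total_writes = window.write("A", "X")  # 6: writes are shared across apps
```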

Lastly, there may be times in which the rate limit values that we return are inconsistent, or cases where no headers are returned at all. Perhaps memcache has been reset, or one memcache was busy so the system spoke to a different instance: the values may be inconsistent now and again. We will make a best effort to maintain consistency, but we will err toward giving an application extra calls if there is an inconsistency.


Tips to avoid being Rate Limited

The tips below are there to help you code defensively and reduce the possibility of being rate limited. Some application features that you may want to provide are simply impossible in light of rate limiting, especially around the freshness of results. If real-time information is an aim of your application, look into The Streaming APIs along with User streams.


Caching

Store API responses in your application or on your site if you expect a lot of use. For example, don’t try to call the Twitter API on every page load of your website landing page. Instead, call the API infrequently and load the response into a local cache. When users hit your website load the cached version of the results.
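One way to follow this advice is a small time-to-live cache in front of the API call. The sketch below is a generic pattern, not something from the Twitter documentation; `TTLCache`, `get_or_fetch` and the injectable `clock` are names chosen for this example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Keep each fetched value for `ttl_seconds` before fetching again."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the cache can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `fetch` only when stale."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: no API call spent
        value = fetch()      # stale or missing: spend one API call
        self._store[key] = (now, value)
        return value
```

On a landing page, `fetch` would be the function that actually calls the API, so at most one call is made per key per TTL period no matter how many page loads occur.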


Prioritize active users

If your site keeps track of many Twitter users (for example, fetching their current status or statistics about their Twitter usage), consider only requesting data for users who have recently signed into your site.


Adapt to the search results

If your application monitors a high volume of search terms, query less often for searches that have no results than for those that do. By using a back-off you can keep up to date on queries that are popular but not waste cycles requesting queries that very rarely change. Alternatively, consider using the Streaming APIs and filter on your search terms.
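A back-off of this kind might be sketched as follows. The base interval, growth factor and ceiling are arbitrary values picked for illustration, and `next_poll_interval` is a hypothetical helper.

```python
def next_poll_interval(
    current: float,
    had_results: bool,
    base: float = 60.0,      # assumed polling interval for active queries, in seconds
    factor: float = 2.0,     # assumed growth factor for quiet queries
    ceiling: float = 3600.0, # assumed upper bound on the interval
) -> float:
    """Pick the delay before the next poll of one search term."""
    if had_results:
        # The query is active again: go back to the fast interval.
        return base
    # Nothing new: wait longer next time, up to the ceiling.
    return min(current * factor, ceiling)
```

Each search term keeps its own interval, so quiet terms drift toward the ceiling while busy ones stay at the base rate.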


Use application-only auth as a “reserve”

Requests using Application-only authentication are evaluated in a separate context to an application’s per-user rate limits. For many scenarios, you may want to use this additional rate limit pool as a “reserve” for your typical user-based operations.


Blacklisting

We ask that you honor the rate limit. If you or your application abuses the rate limits we will blacklist it. If you are blacklisted you will be unable to get a response from the Twitter API. If you or your application has been blacklisted and you think there has been a mistake, you can use our Platform Support forms. So we can get you back online quickly, please include the following information:

  1. If you are using the REST API, make a call to the GET application / rate_limit_status from the account or computer which you believe to be blacklisted.
  2. Explain why you think your application was blacklisted.
  3. Describe in detail how you have fixed the problem that you think caused you to be blacklisted.

Streaming API

The Streaming API has rate limiting and access levels that are appropriate for long-lived connections. See the Streaming APIs documentation for more details.

Leveraging the Streaming API is a great way to free-up your rate limits for more inventive uses of the Twitter API.

Rate Limiting information for the Streaming API is detailed on Connecting to a streaming endpoint.


Limits Per Window Per Resource

The API rate limit window duration is 15 minutes. Visit our API Rate Limit: Chart page to see the limits by resource.

Note that endpoints/resources not listed in the above chart default to 15 requests per allotted user.

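Heh. I just want to ask the person who answered above: do you actually understand this, or did you just copy and paste it from Baidu without even bothering to get over the wall to Google? Two kinds of "buckets"?? Heh. When I first started reading I also saw people translate it this way in answers, and some Chinese documentation translates it this way too, but it rubs me the wrong way. What is a "bucket" supposed to be, can you explain? If you don't really understand it yourself, please don't come out and mislead and confuse other people. I'm an API newbie too, but shouldn't you have your own understanding before answering someone else's question? Copy-pasting something you don't understand, so that nobody else can understand it either, and still trying to pass for an expert??? Heh.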

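Back to the point. My understanding is this:

Twitter API request limits come in two tiers, 15 calls per 15 minutes and 180 calls per 15 minutes, and which one applies depends on the type of request!!! For example, you can fetch Twitter users' basic profile information 12 times per minute on average, that is, 180 times per 15 minutes. But you can fetch those users' follower information only once per minute on average, that is, 15 times per 15 minutes!!!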
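The arithmetic behind those per-minute figures, as a tiny Python sketch (the function names are made up for this example):

```python
WINDOW_SECONDS = 15 * 60  # one rate limit window is 15 minutes


def min_interval_seconds(calls_per_window: int) -> float:
    """Spacing between calls that uses the window evenly without exceeding it."""
    return WINDOW_SECONDS / calls_per_window


def calls_per_minute(calls_per_window: int) -> float:
    """Average calls per minute allowed by a per-window limit."""
    return calls_per_window / 15
```

So a 180-call endpoint can be hit every 5 seconds (12 per minute on average), while a 15-call endpoint can be hit every 60 seconds (1 per minute on average). The limit is counted per window, so calls may also be bunched up as long as the total for the 15 minutes is not exceeded.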