Add more code to IPVS to work with Netfilter connection
tracking and fix some problems.
- Allow IPVS to be compiled without connection tracking as in
2.6.35 and before. This can avoid keeping conntracks for all
IPVS connections because this costs memory. ip_vs_ftp still
depends on connection tracking and NAT as implemented for 2.6.36.
- Add sysctl var "conntrack" to enable connection tracking for
all IPVS connections. For loaded IPVS directors it needs
tuning of nf_conntrack_max limit.
- Add IP_VS_CONN_F_NFCT connection flag to request the connection
to use connection tracking. This allows user space to provide this
flag, for example, in dest->conn_flags. This can be useful to
request connection tracking per real server instead of forcing it
for all connections with the "conntrack" sysctl. This flag is
set currently only by ip_vs_ftp and of course by "conntrack" sysctl.
- Add ip_vs_nfct.c file to hold all connection tracking code,
by this way main code should not depend of netfilter conntrack
support.
- Return back the ip_vs_post_routing handler as in 2.6.35 and use
skb->ipvs_property=1 to allow IPVS to work without connection
tracking
Connection tracking:
- most of the code is already in 2.6.36-rc
- alter conntrack reply tuple for LVS-NAT connections when first packet
from client is forwarded and conntrack state is NEW or RELATED.
Additionally, alter reply for RELATED connections from real server,
again for packet in original direction.
- add IP_VS_XMIT_TUNNEL to confirm conntrack (without altering
reply) for LVS-TUN early because we want to call nf_reset. It is
needed because we add IPIP header and the original conntrack
should be preserved, not destroyed. The transmitted IPIP packets
can reuse same conntrack, so we do not set skb->ipvs_property.
- try to destroy conntrack when the IPVS connection is destroyed.
It is not fatal if conntrack disappears before that, it depends
on the used timers.
Fix problems from long time:
- add skb->ip_summed = CHECKSUM_NONE for the LVS-TUN transmitters
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The wl1271 device is using a reference clock that may change
between board to board.
Make the ref_clock parameter configurable by board settings
instead of having a hard coded value in the sources.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add a simple mechanism to pass platform data to the
SDIO instances of wl12xx.
This way there is no confusion over who owns the 'embedded data',
typechecking is preserved, and no possibility for the wrong driver to
pick up the data.
Originally proposed by Russell King.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Move wl12xx.h outside of the spi-specific location,
so it can be shared with both spi and sdio solutions.
Update all users of spi/wl12xx.h accordingly
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Acked-by: Luciano Coelho <luciano.coelho@nokia.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
As of lately, HID devices which send per-frame data split over several
HID reports have started to emerge. This patch adds a quirk which
allows the HID driver to take over the input layer synchronization,
and hence the control of the frame boundary.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The constants DCCPO_{MIN,MAX}_CCID_SPECIFIC are nowhere used in the code, but
instead for the CCID-specific options numbers are used.
This patch unifies the use of CCID-specific option numbers, by adding symbolic
names reflecting the definitions in RFC 4340, 10.3.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Previously these IDs were only used by one driver, so there was not much
need for having them generically defined. Now that this will no longer
hold true, move them over.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Move 'synced' and 'global_use' fields before 'refcount', to shrinks
struct netdev_hw_addr by 8 bytes (on 64bit arches).
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
netpoll: Disable IRQ around RCU dereference in netpoll_rx
sctp: Do not reset the packet during sctp_packet_config().
net/llc: storing negative error codes in unsigned short
MAINTAINERS: move atlx discussions to netdev
drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory
drivers/net/eql.c: prevent reading uninitialized stack memory
drivers/net/usb/hso.c: prevent reading uninitialized memory
xfrm: dont assume rcu_read_lock in xfrm_output_one()
r8169: Handle rxfifo errors on 8168 chips
3c59x: Remove atomic context inside vortex_{set|get}_wol
tcp: Prevent overzealous packetization by SWS logic.
net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS
phylib: fix PAL state machine restart on resume
net: use rcu_barrier() in rollback_registered_many
bonding: correctly process non-linear skbs
ipv4: enable getsockopt() for IP_NODEFRAG
ipv4: force_igmp_version ignored when a IGMPv3 query received
ppp: potential NULL dereference in ppp_mp_explode()
net/llc: make opt unsigned in llc_ui_setsockopt()
...
Implement flush[_delayed]_work_sync(). These are flush functions
which also make sure no CPU is still executing the target work from
earlier queueing instances. These are similar to
cancel[_delayed]_work_sync() except that the target work item is
flushed instead of cancelled.
Signed-off-by: Tejun Heo <tj@kernel.org>
Make the following cleanup changes.
* Relocate flush/cancel function prototypes and definitions.
* Relocate wait_on_cpu_work() and wait_on_work() before
try_to_grab_pending(). These will be used to implement
flush_work_sync().
* Make all flush/cancel functions return bool instead of int.
* Update wait_on_cpu_work() and wait_on_work() to return %true if they
actually waited.
* Add / update comments.
This patch doesn't cause any functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
alloc_ordered_workqueue() creates a workqueue which processes each
work itemp one by one in the queued order. This will be used to
replace create_freezeable_workqueue() and
create_singlethread_workqueue().
Signed-off-by: Tejun Heo <tj@kernel.org>
We cannot use rcu_dereference_bh safely in netpoll_rx as we may
be called with IRQs disabled. We could however simply disable
IRQs as that too causes BH to be disabled and is safe in either
case.
Thanks to John Linville for discovering this bug and providing
a patch.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool utility does not set masks for flow parameters that are
not specified, so if both value and mask are 0 then this must be
treated as equivalent to a mask with all bits set. Currently that is
done in the only driver that implements RX n-tuple filtering, ixgbe.
Move it to the ethtool core.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first version of the driver had hard-coded the logic
for handling the checksum offloading.
This was designed according to the chips included in
the STM platforms where:
o MAC10/100 supports no COE at all.
o GMAC fully supports RX/TX COE.
This is not good for other chip configurations where,
for example, the mac10/100 supports the tx csum in HW
or when the GMAC has no IPC.
Thanks to Johannes Stezenbach; he provided me a first
draft of this patch that only reviewed the IPC for the
GMAC devices.
This patch also helps on SPEAr platforms where the
MAC10/100 can perform the TX csum in HW.
Thanks to Deepak SIKRI for his support on this.
In the end, GMAC devices for STM platforms have
a bugged Jumbo frame support that needs to have
the Tx COE disabled for oversized frames (due to
limited buffer sizes). This information is also
passed through the driver's platform structure.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Deepak SIKRI <deepak.sikri@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the CSR Clock range selection.
Original patch from Johannes Stezenbach fixed the CSR
in the stmmac_mdio. We agreed to provide this through
the platform instead of.
Also thanks to Johannes for having tested it on ARM.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
A synchronous rename can be interrupted by a SIGKILL. If that happens
during a sillyrename operation, it's possible for the rename call to
be sent to the server, but the task exits before processing the
reply. If this happens, the sillyrenamed file won't get cleaned up
during nfs_dentry_iput and the server is left with a dangling .nfs* file
hanging around.
Fix this problem by turning sillyrename into an asynchronous operation
and have the task doing the sillyrename just wait on the reply. If the
task is killed before the sillyrename completes, it'll still proceed
to completion.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
...since that's where most of the sillyrenaming code lives. A comment
block is added to the beginning as well to clarify how sillyrenaming
works. Also, make nfs_async_unlink static as nfs_sillyrename is the only
caller.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Right now, v3 and v4 have their own variants. Create a standard struct
that will work for v3 and v4. v2 doesn't get anything but a simple error
and so isn't affected by this.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Each NFS version has its own version of the rename args container.
Standardize them on a common one that's identical to the one NFSv4
uses.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Remove all remaining references to the struct nameidata from the low level
NFS layers. Again pass down a partially initialised struct nfs_open_context
when we want to do atomic open+create.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Start moving the 'struct nameidata' dependent code out of the lower level
NFS code in preparation for the removal of open intents.
Instead of the struct nameidata, we pass down a partially initialised
struct nfs_open_context that will be fully initialised by the atomic open
upon success.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Replace duplicate code in NFSROOT for mounting an NFS server on '/'
with logic that uses the existing mainline text-based logic in the NFS
client.
Add documenting comments where appropriate.
Note that this means NFSROOT mounts now use the same default settings
as v2/v3 mounts done via mount(2) from user space.
vers=3,tcp,rsize=<negotiated default>,wsize=<negotiated default>
As before, however, no version/protocol negotiation with the server is
done.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
- the sync protocol supports 16 bits only, so bits 0..15 should be
used only for flags that should go to backup server, bits 16 and
above should be allocated for flags not sent to backup.
- use IP_VS_CONN_F_DEST_MASK as mask of connection flags in
destination that can be changed by user space
- allow IP_VS_CONN_F_ONE_PACKET to be set in destination
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Revert the timer per cpu-context timers because of unfortunate
nohz interaction. Fixing that would have been somewhat ugly, so
go back to driving things from the regular tick. Provide a
jiffies interval feature for people who want slower rotations.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20100917093009.519845633@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Aside from allowing software events into a !software group,
allow adding !software events to pure software groups.
Once we've moved the software group and attached the first
!software event, the group will no longer be a pure software
group and hence no longer be eligible for movement, at which
point the straight ctx comparison is correct again.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20100917093009.410784731@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
commit ab95bfe01 (net: replace hooks in __netif_receive_skb) added
rx_handler at wrong place, between two cache line aligned objects,
creating a big hole (a full cache line)
Move rx_handler and rx_handler_data before rx_queue, filling existing
hole.
Move master field in the cache line(s) used in receive path.
This saves 64 bytes (or L1_CACHE_BYTES), and avoids two possible
cache misses in receive path.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds P2P-STA and P2P-GO as device types so
we can distinguish between those and normal STA
or AP (respectively) type interfaces.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
All the blkdev_issue_* helpers can only sanely be used for synchronous
caller. To issue cache flushes or barriers asynchronously the caller needs
to set up a bio by itself with a completion callback to move the asynchronous
state machine ahead. So drop the BLKDEV_IFL_WAIT flag that is always
specified when calling blkdev_issue_* and also remove the now unused flags
argument to blkdev_issue_flush and blkdev_issue_zeroout. For
blkdev_issue_discard we need to keep it for the secure discard flag, which
gains a more descriptive name and loses the bitops vs flag confusion.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
o Actual implementation of throttling policy in block layer. Currently it
implements READ and WRITE bytes per second throttling logic. IOPS throttling
comes in later patches.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
dev->ip_ptr is protected by rtnl and rcu.
Yet some places dont use appropriate primitives and/or locking rules.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I wish we could use something cleaner, such as bind(). But that would
not work since resource subscription is orthogonal/in addition to the
normal object ID allocated via bind(). This is similar to multicasting
which also uses ioctl()'s.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We sometime want to dereference an rcu protected pointer while
holding RTNL. Use a macro to hide all lockdep details.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct ethtool_rawip4_spec and struct ethtool_ether_spec are neither
commented nor used by any driver, so remove them. Adjust padding in
the user-visible unions that included these structures.
Fix references to struct ethtool_rawip4_spec in
ethtool_get_rx_ntuple(), which should use struct ethtool_usrip4_spec.
struct ethtool_usrip4_spec cannot hold IPv6 host addresses and there
is no separate structure that can, so remove ETH_RX_NFC_IP6 and the
reference to it in niu.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are now several interfaces within the ethtool API for getting
and setting RX flow filtering and hashing behaviour, most of which are
poorly documented. This adds kernel-doc comments for all these
interfaces, based on the existing incomplete comments and on the
initial implementations.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'm reposting this patch series as v4 since there have been no additional
comments, and I cleaned up one extra bit of unneeded code (in 3/3). The patches
are against Linus's tree: 2bfc96a127
(2.6.36-rc3).
Would this patchset be suitable for inclusion in an mm branch?
This changes adds a partition_meta_info struct which itself contains a
union of structures that provide partition table specific metadata.
This change leaves the union empty. The subsequent patch includes an
implementation for CONFIG_EFI_PARTITION-based metadata.
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>