Commit Graph

22936 Commits

Author SHA1 Message Date
Linus Torvalds
3070fb888b Merge branch 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: clkfwk: Build fix for non-legacy CPG changes.
  sh: Use GCC __builtin_prefetch() to implement prefetch().
  sh: fix vsyscall compilation due to .eh_frame issue
  sh: avoid to flush all cache in sys_cacheflush
  sh: clkfwk: Disable init clk op for non-legacy clocks.
  sh: clkfwk: Kill off now unused algo_id in set_rate op.
  sh: clkfwk: Kill off unused clk_set_rate_ex().
2010-11-25 06:58:19 +09:00
Linus Torvalds
c42978f7ec Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio: fix format of sysfs driver/vendor files
  Char: virtio_console, fix memory leak
  virtio: return correct capacity to users
  module: Update prototype for ref_module (formerly use_module)
2010-11-25 06:57:11 +09:00
Kirill A. Shutemov
112bc2e120 memcg: fix false positive VM_BUG on non-SMP
Fix this:

  kernel BUG at mm/memcontrol.c:2155!
  invalid opcode: 0000 [#1]
  last sysfs file:

  Pid: 18, comm: sh Not tainted 2.6.37-rc3 #3 /Bochs
  EIP: 0060:[<c10731b2>] EFLAGS: 00000246 CPU: 0
  EIP is at mem_cgroup_move_account+0xe2/0xf0
  EAX: 00000004 EBX: c6f931d4 ECX: c681c300 EDX: c681c000
  ESI: c681c300 EDI: ffffffea EBP: c681c000 ESP: c46f3e30
   DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
  Process sh (pid: 18, ti=c46f2000 task=c6826e60 task.ti=c46f2000)
  Stack:
   00000155 c681c000 0805f000 c46ee180 c46f3e5c c7058820 c1074d37 00000000
   08060000 c46db9a0 c46ec080 c7058820 0805f000 08060000 c46f3e98 c1074c50
   c106c75e c46f3e98 c46ec080 08060000 0805ffff c46db9a0 c46f3e98 c46e0340
  Call Trace:
   [<c1074d37>] ? mem_cgroup_move_charge_pte_range+0xe7/0x130
   [<c1074c50>] ? mem_cgroup_move_charge_pte_range+0x0/0x130
   [<c106c75e>] ? walk_page_range+0xee/0x1d0
   [<c10725d6>] ? mem_cgroup_move_task+0x66/0x90
   [<c1074c50>] ? mem_cgroup_move_charge_pte_range+0x0/0x130
   [<c1072570>] ? mem_cgroup_move_task+0x0/0x90
   [<c1042616>] ? cgroup_attach_task+0x136/0x200
   [<c1042878>] ? cgroup_tasks_write+0x48/0xc0
   [<c1041e9e>] ? cgroup_file_write+0xde/0x220
   [<c101398d>] ? do_page_fault+0x17d/0x3f0
   [<c108a79d>] ? alloc_fd+0x2d/0xd0
   [<c1041dc0>] ? cgroup_file_write+0x0/0x220
   [<c1077ba2>] ? vfs_write+0x92/0xc0
   [<c1077c81>] ? sys_write+0x41/0x70
   [<c1140e3d>] ? syscall_call+0x7/0xb
  Code: 03 00 74 09 8b 44 24 04 e8 1c f1 ff ff 89 73 04 8d 86 b0 00 00 00 b9 01 00 00 00 89 da 31 ff e8 65 f5 ff ff e9 4d ff ff ff 0f 0b <0f> 0b 0f 0b 0f 0b 90 8d b4 26 00 00 00 00 83 ec 10 8b 0d f4 e3
  EIP: [<c10731b2>] mem_cgroup_move_account+0xe2/0xf0 SS:ESP 0068:c46f3e30
  ---[ end trace 7daa1582159b6532 ]---

lock_page_cgroup and unlock_page_cgroup are implemented using
bit_spinlock.  bit_spinlock doesn't touch the bit if we are on non-SMP
machine, so we can't use the bit to check whether the lock was taken.

Let's introduce is_page_cgroup_locked based on bit_spin_is_locked instead
of PageCgroupLocked to fix it.

[akpm@linux-foundation.org: s/is_page_cgroup_locked/page_is_cgroup_locked/]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25 06:50:40 +09:00
Loïc Minier
3a3a1af37f include/linux/fs.h: fix userspace build
dpkg uses fiemap but didn't particularly need to include stdint.h so far.
Since 367a51a339 ("fs: Add FITRIM ioctl"), build of linux/fs.h failed in
dpkg with:

  In file included from ../../src/filesdb.c:27:0:
  /usr/include/linux/fs.h:37:2: error: expected specifier-qualifier-list before 'uint64_t'

Use exportable type __u64 to avoid the dependency on stdint.h.

b31d42a5af ("Fix compile brekage with !CONFIG_BLOCK") fixed only the
kernel build by including linux/types.h, but this also fixed "make
headers_check", so don't revert it.

Signed-off-by: Loïc Minier <loic.minier@linaro.org>
Tested-by: Arnd Bergmann <arnd.bergmann@linaro.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25 06:50:38 +09:00
John W. Linville
51cce8a590 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-11-24 16:49:20 -05:00
Johannes Berg
c063dbf52b cfg80211: allow using CQM event to notify packet loss
This adds the ability for drivers to use CQM events
to notify about packet loss for specific stations
(which could be the AP for the managed mode case).
Since the threshold might be determined by the
driver (it isn't passed in right now) it will be
passed out of the driver to userspace in the event.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24 16:19:36 -05:00
John W. Linville
d7a066c923 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-11-24 16:19:24 -05:00
John W. Linville
ccb1435401 Revert "nl80211/mac80211: Report signal average"
This reverts commit 86107fd170.

This patch inadvertantly changed the userland ABI.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24 16:18:36 -05:00
Tom Herbert
1d24eb4815 xps: Transmit Packet Steering
This patch implements transmit packet steering (XPS) for multiqueue
devices.  XPS selects a transmit queue during packet transmission based
on configuration.  This is done by mapping the CPU transmitting the
packet to a queue.  This is the transmit side analogue to RPS-- where
RPS is selecting a CPU based on receive queue, XPS selects a queue
based on the CPU (previously there was an XPS patch from Eric
Dumazet, but that might more appropriately be called transmit completion
steering).

Each transmit queue can be associated with a number of CPUs which will
use the queue to send packets.  This is configured as a CPU mask on a
per queue basis in:

/sys/class/net/eth<n>/queues/tx-<n>/xps_cpus

The mappings are stored per device in an inverted data structure that
maps CPUs to queues.  In the netdevice structure this is an array of
num_possible_cpu structures where each structure holds and array of
queue_indexes for queues which that CPU can use.

The benefits of XPS are improved locality in the per queue data
structures.  Also, transmit completions are more likely to be done
nearer to the sending thread, so this should promote locality back
to the socket on free (e.g. UDP).  The benefits of XPS are dependent on
cache hierarchy, application load, and other factors.  XPS would
nominally be configured so that a queue would only be shared by CPUs
which are sharing a cache, the degenerative configuration woud be that
each CPU has it's own queue.

Below are some benchmark results which show the potential benfit of
this patch.  The netperf test has 500 instances of netperf TCP_RR test
with 1 byte req. and resp.

bnx2x on 16 core AMD
   XPS (16 queues, 1 TX queue per CPU)  1234K at 100% CPU
   No XPS (16 queues)                   996K at 100% CPU

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24 11:44:20 -08:00
Tom Herbert
3853b5841c xps: Improvements in TX queue selection
In dev_pick_tx, don't do work in calculating queue
index or setting
the index in the sock unless the device has more than one queue.  This
allows the sock to be set only with a queue index of a multi-queue
device which is desirable if device are stacked like in a tunnel.

We also allow the mapping of a socket to queue to be changed.  To
maintain in order packet transmission a flag (ooo_okay) has been
added to the sk_buff structure.  If a transport layer sets this flag
on a packet, the transmit queue can be changed for the socket.
Presumably, the transport would set this if there was no possbility
of creating OOO packets (for instance, there are no packets in flight
for the socket).  This patch includes the modification in TCP output
for setting this flag.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24 11:44:19 -08:00
Eric Dumazet
456b61bca8 ipv6: mcast: RCU conversion
ipv6_sk_mc_lock rwlock becomes a spinlock.

readers (inet6_mc_check()) now takes rcu_read_lock() instead of read
lock. Writers dont need to disable BH anymore.

struct ipv6_mc_socklist objects are reclaimed after one RCU grace
period.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24 11:16:42 -08:00
Giuseppe CAVALLARO
293bb1c41b stmmac: add init/exit callback in plat_stmmacenet_data struct
This patch adds in the plat_stmmacenet_data
the init and exit callbacks that can be used
for invoking specific platform functions.
For example, on ST targets, these call the
PAD manager functions to set PIO lines and
syscfg registers.
The patch removes the stmmac_claim_resource
only used on STM Kernels as well.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24 11:14:24 -08:00
David S. Miller
66fc5dff5e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-11-24 09:16:14 -08:00
Anders Kaseorg
dfd62d1d84 module: Update prototype for ref_module (formerly use_module)
Commit 9bea7f2395 renamed use_module to
ref_module (and changed its return value), but forgot to update this
prototype in module.h.

Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-11-24 15:21:11 +10:30
Linus Torvalds
ea49b1669b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (41 commits)
  ALSA: hda - Identify more variants for ALC269
  ALSA: hda - Fix wrong ALC269 variant check
  ALSA: hda - Enable jack sense for Thinkpad Edge 11
  ALSA: Revert "ALSA: hda - Fix switching between dmic and mic using the same mux on IDT/STAC"
  ALSA: hda - Fixed ALC887-VD initial error
  ALSA: atmel - Fix the return value in error path
  ALSA: hda: Use hp-laptop quirk to enable headphones automute for Asus A52J
  ALSA: snd-atmel-abdac: test wrong variable
  ALSA: azt3328: period bug fix (for PA), add missing ACK on stop timer
  ALSA: hda: Add Samsung R720 SSID for subwoofer pin fixup
  ALSA: sound/pci/asihpi/hpioctl.c: Remove unnecessary casts of pci_get_drvdata
  ALSA: sound/core/pcm_lib.c: Remove unnecessary semicolons
  ALSA: sound/ppc: Use printf extension %pR for struct resource
  ALSA: ac97: Apply quirk for Dell Latitude D610 binding Master and Headphone controls
  ASoC: uda134x - set reg_cache_default to uda134x_reg
  ASoC: Add support for MAX98089 CODEC
  ASoC: davinci: fixes for multi-component
  ASoC: Fix register cache setup WM8994 for multi-component
  ASoC: Fix dapm_seq_compare() for multi-component
  ASoC: RX1950: Fix hw_params function
  ...
2010-11-24 08:23:56 +09:00
Linus Torvalds
3cbaa0f7a7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  of/phylib: Use device tree properties to initialize Marvell PHYs.
  phylib: Add support for Marvell 88E1149R devices.
  phylib: Use common page register definition for Marvell PHYs.
  qlge: Fix incorrect usage of module parameters and netdev msg level
  ipv6: fix missing in6_ifa_put in addrconf
  SuperH IrDA: correct Baud rate error correction
  atl1c: Fix hardware type check for enabling OTP CLK
  net: allow GFP_HIGHMEM in __vmalloc()
  bonding: change list contact to netdev@vger.kernel.org
  e1000: fix screaming IRQ
2010-11-24 08:22:34 +09:00
Daniel Klaffenbach
1d8638d403 ssb: b43-pci-bridge: Add new vendor for BCM4318
Add new vendor for Broadcom 4318.

Signed-off-by: Daniel Klaffenbach <danielklaffenbach@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-22 15:19:31 -05:00
Trond Myklebust
0b26a0bf6f NFS: Ensure we return the dirent->d_type when it is known
Store the dirent->d_type in the struct nfs_cache_array_entry so that we
can use it in getdents() calls.

This fixes a regression with the new readdir code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-11-22 13:24:48 -05:00
David Daney
90600732d8 phylib: Add support for Marvell 88E1149R devices.
The 88E1149R is 10/100/1000 quad-gigabit Ethernet PHY.  The
.config_aneg function can be shared with 88E1118, but it needs its own
.config_init.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: Cyril Chemparathy <cyril@ti.com>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-22 08:34:23 -08:00
Sridhar Samudrala
eb06acdc85 macvlan: Introduce 'passthru' mode to takeover the underlying device
With the current default 'vepa' mode, a KVM guest using virtio with
macvtap backend has the following limitations.
- cannot change/add a mac address on the guest virtio-net
- cannot create a vlan device on the guest virtio-net
- cannot enable promiscuous mode on guest virtio-net

To address these limitations, this patch introduces a new mode called
'passthru' when creating a macvlan device which allows takeover of the
underlying device and passing it to a guest using virtio with macvtap
backend.

Only one macvlan device is allowed in passthru mode and it inherits
the mac address from the underlying device and sets it in promiscuous
mode to receive and forward all the packets.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>

-------------------------------------------------------------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-22 08:24:29 -08:00
Anand Gadiyar
07a8cdd2bb usb: musb: do not use dma for control transfers
The Inventra DMA engine used with the MUSB controller in many
SoCs cannot use DMA for control transfers on EP0, but can use
DMA for all other transfers.

The USB core maps urbs for DMA if hcd->self.uses_dma is true.
(hcd->self.uses_dma is true for MUSB as well).

Split the uses_dma flag into two - one that says if the
controller needs to use PIO for control transfers, and
another which says if the controller uses DMA (for all
other transfers).

Also, populate this flag for all MUSB by default.

(Tested on OMAP3 and OMAP4 boards, with EHCI and MUSB HCDs
simultaneously in use).

Signed-off-by: Maulik Mankad <x0082077@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Cc: Oliver Neukum <oliver@neukum.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Praveena NADAHALLY <praveen.nadahally@stericsson.com>
Cc: Ajay Kumar Gupta <ajay.gupta@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2010-11-22 12:55:02 +02:00
Linus Torvalds
b86db47442 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard
  fs: Do not dispatch FITRIM through separate super_operation
  ext4: ext4_fill_super shouldn't return 0 on corruption
  jbd2: fix /proc/fs/jbd2/<dev> when using an external journal
  ext4: missing unlock in ext4_clear_request_list()
  ext4: fix setting random pages PageUptodate
2010-11-19 19:46:45 -08:00
Lukas Czerner
93bb41f4f8 fs: Do not dispatch FITRIM through separate super_operation
There was concern that FITRIM ioctl is not common enough to be included
in core vfs ioctl, as Christoph Hellwig pointed out there's no real point
in dispatching this out to a separate vector instead of just through
->ioctl.

So this commit removes ioctl_fstrim() from vfs ioctl and trim_fs
from super_operation structure.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-11-19 21:18:35 -05:00
Linus Torvalds
76db8ac45f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix readdir EOVERFLOW on 32-bit archs
  ceph: fix frag offset for non-leftmost frags
  ceph: fix dangling pointer
  ceph: explicitly specify page alignment in network messages
  ceph: make page alignment explicit in osd interface
  ceph: fix comment, remove extraneous args
  ceph: fix update of ctime from MDS
  ceph: fix version check on racing inode updates
  ceph: fix uid/gid on resent mds requests
  ceph: fix rdcache_gen usage and invalidate
  ceph: re-request max_size if cap auth changes
  ceph: only let auth caps update max_size
  ceph: fix open for write on clustered mds
  ceph: fix bad pointer dereference in ceph_fill_trace
  ceph: fix small seq message skipping
  Revert "ceph: update issue_seq on cap grant"
2010-11-19 15:32:22 -08:00
Linus Torvalds
caf8394524 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (31 commits)
  net: fix kernel-doc for sk_filter_rcu_release
  be2net: Fix to avoid firmware update when interface is not open.
  netfilter: fix IP_VS dependencies
  net: irda: irttp: sync error paths of data- and udata-requests
  ipv6: Expose reachable and retrans timer values as msecs
  ipv6: Expose IFLA_PROTINFO timer values in msecs instead of jiffies
  3c59x: fix build failure on !CONFIG_PCI
  ipg.c: remove id [SUNDANCE, 0x1021]
  net: caif: spi: fix potential NULL dereference
  ath9k_htc: Avoid setting QoS control for non-QoS frames
  net: zero kobject in rx_queue_release
  net: Fix duplicate volatile warning.
  MAINTAINERS: Add stmmac maintainer
  bonding: fix a race in IGMP handling
  cfg80211: fix can_beacon_sec_chan, reenable HT40
  gianfar: fix signedness issue
  net: bnx2x: fix error value sign
  8139cp: fix checksum broken
  r8169: fix checksum broken
  rds: Integer overflow in RDS cmsg handling
  ...
2010-11-19 15:25:59 -08:00
Ohad Ben-Cohen
ed919b0125 mmc: sdio: fix runtime PM anomalies by introducing MMC_CAP_POWER_OFF_CARD
Some board/card/host configurations are not capable of powering off the
card after boot.

To support such configurations, and to allow smoother transition to
runtime PM behavior, MMC_CAP_POWER_OFF_CARD is added, so hosts need to
explicitly indicate whether it's OK to power off their cards after boot.

SDIO core will enable runtime PM for a card only if that cap is set.
As a result, the card will be powered down after boot, and will only
be powered up again when a driver is loaded (and then it's up to the
driver to decide whether power will be kept or not).

This will prevent sdio_bus_probe() failures with setups that do not
support powering off the card.

Reported-and-tested-by: Daniel Drake <dsd@laptop.org>
Reported-and-tested-by: Arnd Hannemann <arnd@arndnet.de>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2010-11-19 17:07:01 -05:00
David S. Miller
24912420e9 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bonding/bond_main.c
	net/core/net-sysfs.c
	net/ipv6/addrconf.c
2010-11-19 13:13:47 -08:00
Eric Dumazet
93aaae2e01 filter: optimize sk_run_filter
Remove pc variable to avoid arithmetic to compute fentry at each filter
instruction. Jumps directly manipulate fentry pointer.

As the last instruction of filter[] is guaranteed to be a RETURN, and
all jumps are before the last instruction, we dont need to check filter
bounds (number of instructions in filter array) at each iteration, so we
remove it from sk_run_filter() params.

On x86_32 remove f_k var introduced in commit 57fe93b374
(filter: make sure filters dont read uninitialized memory)

Note : We could use a CONFIG_ARCH_HAS_{FEW|MANY}_REGISTERS in order to
avoid too many ifdefs in this code.

This helps compiler to use cpu registers to hold fentry and A
accumulator.

On x86_32, this saves 401 bytes, and more important, sk_run_filter()
runs much faster because less register pressure (One less conditional
branch per BPF instruction)

# size net/core/filter.o net/core/filter_pre.o
   text    data     bss     dec     hex filename
   2948       0       0    2948     b84 net/core/filter.o
   3349       0       0    3349     d15 net/core/filter_pre.o

on x86_64 :
# size net/core/filter.o net/core/filter_pre.o
   text    data     bss     dec     hex filename
   5173       0       0    5173    1435 net/core/filter.o
   5224       0       0    5224    1468 net/core/filter_pre.o

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-19 09:49:59 -08:00
Steven Rostedt
0429578016 tracing/events: Show real number in array fields
Currently we have in something like the sched_switch event:

  field:char prev_comm[TASK_COMM_LEN];	offset:12;	size:16;	signed:1;

When a userspace tool such as perf tries to parse this, the
TASK_COMM_LEN is meaningless. This is done because the TRACE_EVENT() macro
simply uses a #len to show the string of the length. When the length is
an enum, we get a string that means nothing for tools.

By adding a static buffer and a mutex to protect it, we can store the
string into that buffer with snprintf and show the actual number.
Now we get:

  field:char prev_comm[16];       offset:12;      size:16;        signed:1;

Something much more useful.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-11-19 10:18:47 -05:00
Paul Mundt
fe040be2fd Merge branch 'common/fbdev-mipi' of master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6 2010-11-19 17:04:25 +09:00
Bruno Randolf
86107fd170 nl80211/mac80211: Report signal average
Extend nl80211 to report an exponential weighted moving average (EWMA) of the
signal value. Since the signal value usually fluctuates between different
packets, an average can be more useful than the value of the last packet.

This uses the recently added generic EWMA library function.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-18 14:22:20 -05:00
Bruno Randolf
c5485a7e75 lib: Add generic exponentially weighted moving average (EWMA) function
This adds generic functions for calculating Exponentially Weighted Moving
Averages (EWMA). This implementation makes use of a structure which keeps the
EWMA parameters and a scaled up internal representation to reduce rounding
errors.

The original idea for this implementation came from the rt2x00 driver
(rt2x00link.c). I would like to use it in several places in the mac80211 and
ath5k code and I hope it can be useful in many other places in the kernel code.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-18 14:21:52 -05:00
Ingo Molnar
ae51ce9061 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2010-11-18 20:07:12 +01:00
Changli Gao
4c3710afbc net: move definitions of BPF_S_* to net/core/filter.c
BPF_S_* are used internally, should not be exposed to the others.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18 10:59:51 -08:00
Linus Torvalds
ed1d77b18c hardirq.h: needs sched.h if using BKL
This really isn't the right thing to do, and strictly speaking we should
have the BKL depth count in the thread info right next to the preempt
count.  The two really do go together.

However, since that would involve a patch to all architectures, and the
BKL is finally going away, it's simply not worth the effort to do the
RightThing(tm).  Just re-instate the <linux/sched.h> include that we
used to get accidentally from the smp_lock.h one.

This is all fallout from the same old "BKL: remove extraneous #include
<smp_lock.h>" commit.

Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-18 10:56:29 -08:00
Eric Dumazet
866f3b25a2 bonding: IGMP handling cleanup
Instead of iterating in_dev->mc_list from bonding driver, its better
to call a helper function provided by igmp.c
Details of implementation (locking) are private to igmp code.

ip_mc_rejoin_group(struct ip_mc_list *im) becomes
ip_mc_rejoin_groups(struct in_device *in_dev);

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18 09:33:19 -08:00
Frederic Weisbecker
423478cde4 tracing: Remove useless syscall ftrace_event_call declaration
It is defined right after, which makes the declaration completely
useless.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jason Baron <jbaron@redhat.com>
2010-11-18 14:37:45 +01:00
Frederic Weisbecker
53cf810b19 tracing: Allow syscall trace events for non privileged users
As for the raw syscalls events, individual syscall events won't
leak system wide information on task bound tracing. Allow non
privileged users to use them in such workflow.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jason Baron <jbaron@redhat.com>
2010-11-18 14:37:44 +01:00
Frederic Weisbecker
1ed0c59711 tracing: New macro to set up initial event flags value
This introduces the new TRACE_EVENT_FLAGS() macro in order
to set up initial event flags value.

This macro must simply follow the definition of a trace event
and take the event name and the flag value as parameters:

TRACE_EVENT(my_event, .....
....
);

TRACE_EVENT_FLAGS(my_event, 1)

This will set up 1 as the initial my_event->flags value.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jason Baron <jbaron@redhat.com>
2010-11-18 14:37:42 +01:00
Frederic Weisbecker
61c32659b1 tracing: New flag to allow non privileged users to use a trace event
This adds a new trace event internal flag that allows them to be
used in perf by non privileged users in case of task bound tracing.

This is desired for syscalls tracepoint because they don't leak
global system informations, like some other tracepoints.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jason Baron <jbaron@redhat.com>
2010-11-18 14:37:40 +01:00
Soeren Sandmann Pedersen
9c0729dc80 x86: Eliminate bp argument from the stack tracing routines
The various stack tracing routines take a 'bp' argument in which the
caller is supposed to provide the base pointer to use, or 0 if doesn't
have one. Since bp is garbage whenever CONFIG_FRAME_POINTER is not
defined, this means all callers in principle should either always pass
0, or be conditional on CONFIG_FRAME_POINTER.

However, there are only really three use cases for stack tracing:

(a) Trace the current task, including IRQ stack if any
(b) Trace the current task, but skip IRQ stack
(c) Trace some other task

In all cases, if CONFIG_FRAME_POINTER is not defined, bp should just
be 0.  If it _is_ defined, then

- in case (a) bp should be gotten directly from the CPU's register, so
  the caller should pass NULL for regs,

- in case (b) the caller should should pass the IRQ registers to
  dump_trace(),

- in case (c) bp should be gotten from the top of the task's stack, so
  the caller should pass NULL for regs.

Hence, the bp argument is not necessary because the combination of
task and regs is sufficient to determine an appropriate value for bp.

This patch introduces a new inline function stack_frame(task, regs)
that computes the desired bp. This function is then called from the
two versions of dump_stack().

Signed-off-by: Soren Sandmann <ssp@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@infradead.org>,
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>,
LKML-Reference: <m3oc9rop28.fsf@dhcp-100-3-82.bos.redhat.com>>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-11-18 14:37:34 +01:00
matthieu castet
84e1c6bb38 x86: Add RO/NX protection for loadable kernel modules
This patch is a logical extension of the protection provided by
CONFIG_DEBUG_RODATA to LKMs. The protection is provided by
splitting module_core and module_init into three logical parts
each and setting appropriate page access permissions for each
individual section:

 1. Code: RO+X
 2. RO data: RO+NX
 3. RW data: RW+NX

In order to achieve proper protection, layout_sections() have
been modified to align each of the three parts mentioned above
onto page boundary. Next, the corresponding page access
permissions are set right before successful exit from
load_module(). Further, free_module() and sys_init_module have
been modified to set module_core and module_init as RW+NX right
before calling module_free().

By default, the original section layout and access flags are
preserved. When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y,
the patch will page-align each group of sections to ensure that
each page contains only one type of content and will enforce
RO/NX for each group of pages.

  -v1: Initial proof-of-concept patch.
  -v2: The patch have been re-written to reduce the number of #ifdefs
       and to make it architecture-agnostic. Code formatting has also
       been corrected.
  -v3: Opportunistic RO/NX protection is now unconditional. Section
       page-alignment is enabled when CONFIG_DEBUG_RODATA=y.
  -v4: Removed most macros and improved coding style.
  -v5: Changed page-alignment and RO/NX section size calculation
  -v6: Fixed comments. Restricted RO/NX enforcement to x86 only
  -v7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added
       calls to set_all_modules_text_rw() and set_all_modules_text_ro()
       in ftrace
  -v8: updated for compatibility with linux 2.6.33-rc5
  -v9: coding style fixes
 -v10: more coding style fixes
 -v11: minor adjustments for -tip
 -v12: minor adjustments for v2.6.35-rc2-tip
 -v13: minor adjustments for v2.6.37-rc1-tip

Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com>
Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Dave Jones <davej@redhat.com>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4CE2F914.9070106@free.fr>
[ minor cleanliness edits, -v14: build failure fix ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 13:32:56 +01:00
Paul Turner
a7a4f8a752 sched: Add sysctl_sched_shares_window
Introduce a new sysctl for the shares window and disambiguate it from
sched_time_avg.

A 10ms window appears to be a good compromise between accuracy and performance.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234938.112173964@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 13:27:49 +01:00
Peter Zijlstra
2069dd75c7 sched: Rewrite tg_shares_up)
By tracking a per-cpu load-avg for each cfs_rq and folding it into a
global task_group load on each tick we can rework tg_shares_up to be
strictly per-cpu.

This should improve cpu-cgroup performance for smp systems
significantly.

[ Paul: changed to use queueing cfs_rq + bug fixes ]

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234937.580480400@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 13:27:46 +01:00
Peter Zijlstra
48c5ccae88 sched: Simplify cpu-hot-unplug task migration
While discussing the need for sched_idle_next(), Oleg remarked that
since try_to_wake_up() ensures sleeping tasks will end up running on a
sane cpu, we can do away with migrate_live_tasks().

If we then extend the existing hack of migrating current from
CPU_DYING to migrating the full rq worth of tasks from CPU_DYING, the
need for the sched_idle_next() abomination disappears as well, since
idle will be the only possible thread left after the migration thread
stops.

This greatly simplifies the hot-unplug task migration path, as can be
seen from the resulting code reduction (and about half the new lines
are comments).

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1289851597.2109.547.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 13:27:46 +01:00
Ingo Molnar
92fd4d4d67 Merge commit 'v2.6.37-rc2' into sched/core
Merge reason: Move to a .37-rc base.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 13:22:26 +01:00
Ingo Molnar
fcf48a725a Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent 2010-11-18 10:37:51 +01:00
Don Zickus
072b198a4a x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog
Now that the bulk of the old nmi_watchdog is gone, remove all
the stub variables and hooks associated with it.

This touches lots of files mainly because of how the io_apic
nmi_watchdog was implemented.  Now that the io_apic nmi_watchdog
is forever gone, remove all its fingers.

Most of this code was not being exercised by virtue of
nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything to
risky here.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: fweisbec@gmail.com
Cc: gorcunov@openvz.org
LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 09:08:23 +01:00
Don Zickus
5f2b0ba4d9 x86, nmi_watchdog: Remove the old nmi_watchdog
Now that we have a new nmi_watchdog that is more generic and
sits on top of the perf subsystem, we really do not need the old
nmi_watchdog any more.

In addition, the old nmi_watchdog doesn't really work if you are
using the default clocksource, hpet.  The old nmi_watchdog code
relied on local apic interrupts to determine if the cpu is still
alive.  With hpet as the clocksource, these interrupts don't
increment any more and the old nmi_watchdog triggers false
postives.

This piece removes the old nmi_watchdog code and stubs out any
variables and functions calls.  The stubs are the same ones used
by the new nmi_watchdog code, so it should be well tested.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: fweisbec@gmail.com
Cc: gorcunov@openvz.org
LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18 09:08:23 +01:00
Linus Torvalds
0a5b871ea4 hardirq.h: remove now-empty #ifdef/#endif pair
Commit 451a3c24b0 ("BKL: remove extraneous #include <smp_lock.h>")
removed the #include line that was the only thing that was surrounded by
the #ifdef/#endif.

So now that #ifdef is guarding nothing at all. Just remove it.

Reported-by: Byeong-ryeol Kim <brofkims@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-17 18:36:25 -08:00